dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0

[Sparse Attention][Performance] Accelerate the performance of sparse attention + Benchmark #1397

Open sxjscience opened 3 years ago

sxjscience commented 3 years ago

We have an ongoing effort to support sparse attention in GluonNLP: https://github.com/dmlc/gluon-nlp/pull/1395. To better accelerate the related kernels, we can compare the performance of several potential solutions, including:
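As a point of reference for what these kernels compute, here is a minimal NumPy sketch (not GluonNLP code) contrasting dense scaled dot-product attention with a local-window sparse variant, one of the common sparsity patterns; all names here are illustrative:

```python
import numpy as np

def dense_attention(q, k, v):
    # Standard scaled dot-product attention: O(n^2) time and memory.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def local_window_attention(q, k, v, window=2):
    # Sparse variant: each query attends only to keys inside a fixed
    # local window, cutting the work from O(n^2) to O(n * window).
    n, d = q.shape
    out = np.empty_like(v)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        s = q[i] @ k[lo:hi].T / np.sqrt(d)
        s -= s.max()
        w = np.exp(s)
        w /= w.sum()
        out[i] = w @ v[lo:hi]
    return out
```

When the window covers the full sequence, the sparse variant reduces to the dense one, which gives an easy correctness check before benchmarking.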

sxjscience commented 3 years ago

@ZiyueHuang I created this issue to discuss how we may use TVM to speed up these kernels.
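Whatever backend is chosen (TVM or otherwise), the candidate kernels need a common timing harness. A minimal sketch of one, using only the standard library and NumPy; the `bench` helper and the toy masked workload are illustrative, not part of GluonNLP:

```python
import time
import numpy as np

def bench(fn, *args, repeat=10):
    # Best-of-`repeat` wall-clock time, after one warm-up call.
    fn(*args)  # warm up caches / lazy initialization
    times = []
    for _ in range(repeat):
        t0 = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - t0)
    return min(times)

# Example: compare a dense matmul against a toy masked variant
# (stand-ins for the dense and sparse attention kernels).
n = 128
rng = np.random.default_rng(0)
a = rng.standard_normal((n, n))
b = rng.standard_normal((n, n))
mask = np.tril(np.ones((n, n), dtype=bool))  # causal-style sparsity pattern

t_dense = bench(lambda x, y: x @ y, a, b)
t_masked = bench(lambda x, y: np.where(mask, x @ y, 0.0), a, b)
```

Reporting best-of-N rather than the mean filters out scheduler noise, which matters when the kernels under comparison differ by small constant factors.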