microsoft MInference issues

microsoft / MInference

To speed up long-context LLM inference, MInference computes attention approximately using dynamic sparsity, which reduces pre-filling latency by up to 10x on an A100 while maintaining accuracy.

https://aka.ms/MInference

MIT License

284 stars 11 forks

issues

[Bug]: Does MInference require flash-attention to accelerate inference? The A6000 server does not support flash-attention

#24 yawzhe closed 22 hours ago
1
[Question]: Is A6000 supported?

#23 yawzhe opened 1 day ago
1
Feature(MInference): update HF demo information, thanks @AK's sponsoring

#22 iofu728 closed 1 day ago
0
[Question]: vertical slash pattern

#21 SimJeg opened 2 days ago
1
[Question]: Question about KV-cache storage

#20 DerrickYLJ opened 3 days ago
3
add vllm support for 0.4.2 and 0.4.3

#19 liyucheng09 closed 2 days ago
0
[Question]: MInference Pre filling is slower than the vllm original version

#18 junior-zsy opened 4 days ago
1
[Question]: Some questions on the code

#17 cyLi-Tiger opened 4 days ago
4
[Question]: For the tests such as RULER and InfiniteBench mentioned in the paper, what datasets are used to search for patterns?

#16 hijkzzz opened 4 days ago
5
Prerelease(MInference): update version

#15 iofu728 closed 4 days ago
0
Hotfix(MInference): fix the configs in pip

#14 iofu728 closed 4 days ago
0
[Question]: python run_vllm.py TypeError: 'type' object is not subscriptable

#13 junior-zsy closed 2 days ago
6
[Question]: pip install minference error: cannot import name 'packaging' from 'pkg_resources'

#12 junior-zsy closed 4 days ago
1
Feature(MInference): add bdist cache

#11 iofu728 closed 5 days ago
0
Hotfix(Minference): fix the yaml

#10 iofu728 closed 5 days ago
0
Hotfix(MInference): fix the yaml

#9 iofu728 closed 5 days ago
0
Hotfix(MInference): fix the pip setup

#8 iofu728 closed 5 days ago
0
[Question]: Hope to supplement the situation of increasing HBM usage with the context.

#7 Arcmoon-Hu closed 5 days ago
2
Hotfix(MInference): fix the pip setup issue

#6 iofu728 closed 5 days ago
0
Feature(MInference): add arXiv paper

#5 iofu728 closed 6 days ago
0
Feature(MInference): fix unittest

#4 iofu728 closed 6 days ago
0
Doc(MInference): update paper information

#3 iofu728 closed 6 days ago
0
PreRelease: v0.1.0

#2 iofu728 closed 6 days ago
0
Action required: migrate or opt-out of migration to GitHub inside Microsoft

#1 microsoft-github-policy-service[bot] closed 2 weeks ago
3