microsoft/MInference
To speed up long-context LLM inference, MInference computes attention approximately using dynamic sparse patterns, which reduces pre-filling latency by up to 10x on an A100 while maintaining accuracy.
https://aka.ms/MInference
MIT License
284 stars
11 forks
issues
[Bug]: Does MInference require flash-attention to accelerate inference? The A6000 server does not support flash-attention
#24
yawzhe
closed
22 hours ago
1
[Question]: Is A6000 supported?
#23
yawzhe
opened
1 day ago
1
Feature(MInference): update HF demo information, thanks @AK's sponsoring
#22
iofu728
closed
1 day ago
0
[Question]: vertical slash pattern
#21
SimJeg
opened
2 days ago
1
[Question]: Question about KV-cache storage
#20
DerrickYLJ
opened
3 days ago
3
add vllm support for 0.4.2 and 0.4.3
#19
liyucheng09
closed
2 days ago
0
[Question]: MInference Pre filling is slower than the vllm original version
#18
junior-zsy
opened
4 days ago
1
[Question]: Some questions on the code
#17
cyLi-Tiger
opened
4 days ago
4
[Question]: For the tests such as RULER and InfiniteBench mentioned in the paper, what datasets are used to search for patterns?
#16
hijkzzz
opened
4 days ago
5
Prerelease(MInference): update version
#15
iofu728
closed
4 days ago
0
Hotfix(MInference): fix the configs in pip
#14
iofu728
closed
4 days ago
0
[Question]: python run_vllm.py TypeError: 'type' object is not subscriptable
#13
junior-zsy
closed
2 days ago
6
[Question]: pip install minference error: cannot import name 'packaging' from 'pkg_resources'
#12
junior-zsy
closed
4 days ago
1
Feature(MInference): add bdist cache
#11
iofu728
closed
5 days ago
0
Hotfix(Minference): fix the yaml
#10
iofu728
closed
5 days ago
0
Hotfix(MInference): fix the yaml
#9
iofu728
closed
5 days ago
0
Hotfix(MInference): fix the pip setup
#8
iofu728
closed
5 days ago
0
[Question]: Hope to supplement the situation of increasing HBM usage with the context.
#7
Arcmoon-Hu
closed
5 days ago
2
Hotfix(MInference): fix the pip setup issue
#6
iofu728
closed
5 days ago
0
Feature(MInference): add arXiv paper
#5
iofu728
closed
6 days ago
0
Feature(MInference): fix unittest
#4
iofu728
closed
6 days ago
0
Doc(MInference): update paper information
#3
iofu728
closed
6 days ago
0
PreRelease: v0.1.0
#2
iofu728
closed
6 days ago
0
Action required: migrate or opt-out of migration to GitHub inside Microsoft
#1
microsoft-github-policy-service[bot]
closed
2 weeks ago
3