FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

Apache License 2.0

9.21k stars, 547 forks

issues

ValueError: cannot reshape array of size 0 into shape (7168,28672)
#41 progressionnetwork opened 1 year ago (1 comment)

Add Erebus and GALACTICA support
#40 Sumanai opened 1 year ago (11 comments)

Resolve typos in comments and README
#39 tomaarsen closed 1 year ago (0 comments)

CPU/GPU transfer
#38 sshleifer closed 1 year ago (2 comments)

Revert "replace Argparse by Fire"
#37 Ying1123 closed 1 year ago (0 comments)

[Multi-line Chatbot] Multiple line chat answers cut off?
#36 SoftologyPro opened 1 year ago (3 comments)

What's differents in FlexGen and ColossalAI ?
#35 Kuri-su closed 1 year ago (6 comments)

Just a suggestion: Think about what Automatic1111 did to Stable Diffusion
#34 cmp-nct opened 1 year ago (2 comments)

Unable to run the benchmark
#33 fungiboletus closed 1 year ago (2 comments)

Update README
#32 merrymercy closed 1 year ago (0 comments)

Update README.md
#31 zhangce closed 1 year ago (0 comments)

Update README.md
#30 merrymercy closed 1 year ago (0 comments)

PermissionError on delete
#29 xaedes closed 1 year ago (1 comment)

Pass over README
#28 DanFu09 closed 1 year ago (1 comment)

Can I use FlaxGen's offloading and compression without caching?
#27 yonikremer closed 1 year ago (2 comments)

something wrong in the google colab
#26 azoth07 opened 1 year ago (0 comments)

Fix typos in README
#25 Calamari closed 1 year ago (1 comment)

Question: FlexGen seems slower than simple CPU code, am I missing something? [see discussion]
#24 justheuristic closed 1 year ago (19 comments)

Update README.md
#23 keroro824 closed 1 year ago (0 comments)

Is this support korean??
#22 waikoreaweatherpjt closed 1 year ago (1 comment)

replace Argparse by Fire
#21 Borda closed 1 year ago (2 comments)

Doesn't seem to obey --path argument, instead try to download to .cache again
#20 hsaito opened 1 year ago (4 comments)

Suggestion: Add support for different decoding strategies (Top P)
#19 anujnayyar1 opened 1 year ago (2 comments)

fix: use perf counter for benchmark timing
#18 kemingy closed 1 year ago (0 comments)

Suggestion: Add Bloom support
#17 robinsongh381 opened 1 year ago (1 comment)

Add __init__.py
#16 shughes-uk closed 1 year ago (0 comments)

Update README.md
#15 eltociear closed 1 year ago (0 comments)

Link to paper.pdf is broken
#14 pdh closed 1 year ago (1 comment)

How can I use this project for my own model? Or what are the key lines of code?
#13 guotong1988 closed 1 year ago (1 comment)

This project is software or hardware?
#12 guotong1988 closed 1 year ago (1 comment)

Out-of-memory during weight download and conversion
#11 xloem closed 1 year ago (9 comments)

Support opt-iml
#10 Ying1123 closed 1 year ago (0 comments)

Suggestion: Add GPT-NeoX 20B support
#9 ElleLeonne closed 1 year ago (0 comments)

Improve readme & chatbot
#8 Ying1123 closed 1 year ago (0 comments)

Update README.md
#7 keroro824 closed 1 year ago (0 comments)

Offloading on Windows?
#6 akhilshastrinet closed 1 year ago (2 comments)

Typo in README
#5 eazel7 closed 1 year ago (0 comments)

3090
#4 random452 closed 1 year ago (3 comments)

Why is offloading necessary at all?
#3 amogkam closed 1 year ago (3 comments)

Discord Link Is Broken
#2 Marviel closed 1 year ago (2 comments)

Support for RWKV language model
#1 BlinkDL opened 1 year ago (0 comments)