FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
Apache License 2.0 · 9.2k stars · 547 forks
Issues
#91 · FlexGen for Data wrangle Tasks. · BinhangYuan · closed 1 year ago · 0 comments
#90 · M1/M2 Mac (MPS) support + · con3code · closed 1 year ago · 0 comments
#89 · Updates on README.md · zhangce · closed 1 year ago · 0 comments
#88 · Update README.md · nicholasachow · closed 1 year ago · 0 comments
#87 · Where is the chatbot? I miss it! · fuhengwu2021 · closed 1 year ago · 7 comments
#86 · Soft lockup after running flex_opt · zhang677 · opened 1 year ago · 1 comment
#85 · Update readme · Ying1123 · closed 1 year ago · 0 comments
#84 · Add reproduce scripts for seq len 256 · Ying1123 · closed 1 year ago · 0 comments
#83 · added support for galactica-30b · BinhangYuan · closed 1 year ago · 0 comments
#82 · Add more HELM examples · Ying1123 · closed 1 year ago · 0 comments
#81 · is cpu peak_mem monitoring work? · dlfrnaos19 · opened 1 year ago · 2 comments
#80 · CPU and M1/M2 GPU platform support · xiezhq-hermann · opened 1 year ago · 0 comments
#79 · Add some scripts for ablation study · Ying1123 · closed 1 year ago · 0 comments
#78 · Update HELM XSUM text summarization benchmark · Ying1123 · closed 1 year ago · 0 comments
#77 · Update instructions for OPT-175B · merrymercy · closed 1 year ago · 0 comments
#76 · Support Galactica · 2003pro · closed 1 year ago · 1 comment
#75 · AttributeError: 'OptLM' object has no attribute 'weight_home' · shadowcz007 · opened 1 year ago · 9 comments
#74 · AttributeError · shadowcz007 · closed 1 year ago · 0 comments
#73 · opt-175b model how to load model from disc. · prof-schacht · opened 1 year ago · 2 comments
#72 · Add a HELM text summarization (xsum) example · Ying1123 · closed 1 year ago · 0 comments
#71 · CPU and M1/M2 GPU platform support · xiezhq-hermann · closed 1 year ago · 5 comments
#70 · Move apps into flexgen package · Ying1123 · closed 1 year ago · 0 comments
#69 · Faster and memory-efficient weight download · Ying1123 · closed 1 year ago · 0 comments
#68 · Simplify API · Ying1123 · closed 1 year ago · 0 comments
#67 · Soft Label of Flexgen · 2003pro · opened 1 year ago · 1 comment
#66 · fix import error · TakanoTaiga · closed 1 year ago · 0 comments
#65 · Issue with flexgen when running python script · PsoriasiIR · closed 1 year ago · 2 comments
#64 · Context Length? · Latrasis · closed 1 year ago · 1 comment
#63 · May i ask why the model is hard coded as "facebook/opt-30b"? · AISuperMa · closed 1 year ago · 1 comment
#62 · [Apple M1 Max] TypeError: object.__new__() takes exactly one argument (the type to instantiate) · certik · closed 1 year ago · 1 comment
#61 · Offloading to disk does not work (opt-66b) · Paethon · closed 1 year ago · 3 comments
#60 · Is LLaMa supported? · NightMachinery · opened 1 year ago · 28 comments
#59 · Create requirements.txt · Bazla24 · closed 1 year ago · 1 comment
#58 · [CI] Add unit tests · Bazla24 · closed 1 year ago · 2 comments
#57 · rtx3090 cublasGemmStridedBatchedExFix CUBLAS_STATUS_NOT_SUPPORTED · ningpengtao-coder · closed 1 year ago · 1 comment
#56 · Update README · merrymercy · closed 1 year ago · 0 comments
#55 · Update documentation to explicitly describe compatability/performance with early Pascal cards · tensiondriven · opened 1 year ago · 6 comments
#54 · Release a pip installable package · LukeLIN-web · closed 1 year ago · 8 comments
#53 · Explain API usage · Ying1123 · closed 1 year ago · 0 comments
#52 · Explain batch size · Ying1123 · closed 1 year ago · 0 comments
#51 · Update readme · merrymercy · closed 1 year ago · 0 comments
#50 · Fix the output issue in benchmarking scripts · zhuohan123 · closed 1 year ago · 0 comments
#49 · FLAN-T5 XXL compatibility · ekiwi111 · opened 1 year ago · 1 comment
#48 · Example commands for OPT-30B and OPT-66B on machines with 32GB of system RAM and 24 GB of VRAM. · Meatfucker · closed 1 year ago · 1 comment
#47 · RuntimeError: CUDA error: out of memory | WSL2 | RTX 3090 | OPT-6.7B · ekiwi111 · closed 1 year ago · 1 comment
#46 · Mac · Karakor23 · opened 1 year ago · 4 comments
#45 · Update README.md · zhangce · closed 1 year ago · 0 comments
#44 · RuntimeError: CUDA error: out of memory | OPT-1.3b | RTX 3090 · ekiwi111 · closed 1 year ago · 4 comments
#43 · Edit README · DanFu09 · closed 1 year ago · 0 comments
#42 · Pytorch-Lightning strategy · k-sparrow · opened 1 year ago · 2 comments