FMInference FlexLLMGen issues

FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

Apache License 2.0

9.18k stars 548 forks

issues

FlexGen for Data wrangle Tasks.

#91 BinhangYuan closed 1 year ago
0
M1/M2 Mac (MPS) support +

#90 con3code closed 1 year ago
0
Updates on README.md

#89 zhangce closed 1 year ago
0
Update README.md

#88 nicholasachow closed 1 year ago
0
Where is the chatbot? I miss it!

#87 fuhengwu2021 closed 1 year ago
7
Soft lockup after running flex_opt

#86 zhang677 opened 1 year ago
1
Update readme

#85 Ying1123 closed 1 year ago
0
Add reproduce scripts for seq len 256

#84 Ying1123 closed 1 year ago
0
added support for galactica-30b

#83 BinhangYuan closed 1 year ago
0
Add more HELM examples

#82 Ying1123 closed 1 year ago
0
is cpu peak_mem monitoring work?

#81 dlfrnaos19 opened 1 year ago
2
CPU and M1/M2 GPU platform support

#80 xiezhq-hermann opened 1 year ago
0
Add some scripts for ablation study

#79 Ying1123 closed 1 year ago
0
Update HELM XSUM text summarization benchmark

#78 Ying1123 closed 1 year ago
0
Update instructions for OPT-175B

#77 merrymercy closed 1 year ago
0
Support Galactica

#76 2003pro closed 1 year ago
1
AttributeError: 'OptLM' object has no attribute 'weight_home'

#75 shadowcz007 opened 1 year ago
9
AttributeError

#74 shadowcz007 closed 1 year ago
0
opt-175b model how to load model from disc.

#73 prof-schacht opened 1 year ago
2
Add a HELM text summarization (xsum) example

#72 Ying1123 closed 1 year ago
0
CPU and M1/M2 GPU platform support

#71 xiezhq-hermann closed 1 year ago
5
Move apps into flexgen package

#70 Ying1123 closed 1 year ago
0
Faster and memory-efficient weight download

#69 Ying1123 closed 1 year ago
0
Simplify API

#68 Ying1123 closed 1 year ago
0
Soft Label of Flexgen

#67 2003pro opened 1 year ago
1
fix import error

#66 TakanoTaiga closed 1 year ago
0
Issue with flexgen when running python script

#65 PsoriasiIR closed 1 year ago
2
Context Length?

#64 Latrasis closed 1 year ago
1
May i ask why the model is hard coded as "facebook/opt-30b"?

#63 AISuperMa closed 1 year ago
1
[Apple M1 Max] TypeError: object.__new__() takes exactly one argument (the type to instantiate)

#62 certik closed 1 year ago
1
Offloading to disk does not work (opt-66b)

#61 Paethon closed 1 year ago
3
Is LLaMa supported?

#60 NightMachinery opened 1 year ago
28
Create requirements.txt

#59 Bazla24 closed 1 year ago
1
[CI] Add unit tests

#58 Bazla24 closed 1 year ago
2
rtx3090 cublasGemmStridedBatchedExFix CUBLAS_STATUS_NOT_SUPPORTED

#57 ningpengtao-coder closed 1 year ago
1
Update README

#56 merrymercy closed 1 year ago
0
Update documentation to explicitly describe compatability/performance with early Pascal cards

#55 tensiondriven opened 1 year ago
6
Release a pip installable package

#54 LukeLIN-web closed 1 year ago
8
Explain API usage

#53 Ying1123 closed 1 year ago
0
Explain batch size

#52 Ying1123 closed 1 year ago
0
Update readme

#51 merrymercy closed 1 year ago
0
Fix the output issue in benchmarking scripts

#50 zhuohan123 closed 1 year ago
0
FLAN-T5 XXL compatibility

#49 ekiwi111 opened 1 year ago
1
Example commands for OPT-30B and OPT-66B on machines with 32GB of system RAM and 24 GB of VRAM.

#48 Meatfucker closed 1 year ago
1
RuntimeError: CUDA error: out of memory | WSL2 | RTX 3090 | OPT-6.7B

#47 ekiwi111 closed 1 year ago
1
Mac

#46 Karakor23 opened 1 year ago
4
Update README.md

#45 zhangce closed 1 year ago
0
RuntimeError: CUDA error: out of memory | OPT-1.3b | RTX 3090

#44 ekiwi111 closed 1 year ago
4
Edit README

#43 DanFu09 closed 1 year ago
0
Pytorch-Lightning strategy

#42 k-sparrow opened 1 year ago
2
