pytorch-labs/gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License
5.67k stars
514 forks
issues
Has anyone run this code with bs>1 and speculatively?
#214
deafTim
opened
4 weeks ago
0
Mistake in 191 line if is_speculative=True generate.py ?
#213
deafTim
opened
1 month ago
1
[wip] entropy specdec
#212
stillmatic
closed
1 month ago
0
Error with meta-llama/Llama-3.2-1B
#211
deafTim
opened
1 month ago
2
Request for Smaller Model Options (~1B Parameters)
#210
deafTim
opened
1 month ago
0
Error with stories15M and stories110M
#209
deafTim
opened
1 month ago
8
Adding torchao apis to gpt-fast
#208
HDCharles
opened
1 month ago
2
The Actual Throughput of int8 Quantization is Significantly Lower than Baseline on A100
#207
crhcrhcrhcrh
closed
1 month ago
4
Support generation tasks for eval.py
#206
mostafaelhoushi
opened
1 month ago
0
add huggingface-hub
#205
kunschg
opened
1 month ago
0
update README.md
#204
kunschg
opened
1 month ago
0
feat: add Llama-3.2-[1B/3B] support
#203
stillmatic
opened
1 month ago
2
Fix Llama3 HF checkpoint converter
#202
yanboliang
closed
2 months ago
0
Ensemble
#201
mayank31398
closed
2 months ago
1
Add support for llama 3.1 8B/70B
#200
yanboliang
closed
2 months ago
0
Support Llama-3.1-405B
#199
yanboliang
closed
2 months ago
0
Reasons for the poor effect of Speculative Sampling
#198
JoeNan1
opened
2 months ago
1
Fix docstring args names
#197
kit1980
closed
3 months ago
0
Integrate Flex Decoding
#196
BoyuanFeng
opened
3 months ago
2
Add Phi-3-mini-4k-instruct bfloat16/int8
#195
makaveli10
opened
3 months ago
2
Activation quantization support
#194
ayyoobimani
opened
3 months ago
1
int4 quantization cpu fix
#193
likholat
opened
3 months ago
3
flex_attention ver.
#192
joydddd
opened
3 months ago
2
Update sdpa function with enable_gqa=True
#191
jainapurva
opened
4 months ago
1
permute function in `convert_hf_checkpoint.py`
#190
Sohaib9920
closed
4 months ago
3
trying to convert huggingface whisper model to pytorch
#189
nullonesix
opened
4 months ago
1
Support of FlashDecoding
#188
jianc99
closed
4 months ago
3
Decouple int4 weight with serialized format
#187
yanbing-j
opened
4 months ago
5
tokenizer.model
#186
hasakikiki
opened
4 months ago
1
It doesn't accelerate very well at L4
#185
songh11
opened
5 months ago
1
getting different acceptance prob when using `torch.compile` after making a small change.
#184
kalradivyanshu
opened
5 months ago
0
Question about the ENABLE_INTRA_NODE_COMM for speculative decoding
#183
jianc99
closed
4 months ago
9
GGUF support?
#182
yukiarimo
opened
5 months ago
0
Fix rope base issue with llama 3
#181
VikParuchuri
closed
5 months ago
3
[WIP] Use DTensor-based tensor parallel
#180
kwen2501
opened
5 months ago
0
`meta-llama/Meta-Llama-3-8B-Instruct` generates gibberish for long prompts
#179
griff4692
closed
5 months ago
5
Update installation instructions in README.md
#178
Jokeren
closed
5 months ago
1
Hard-coded Llama-3 model name pattern matching breaks scripts/convert_hf_checkpoint.py
#177
ephremw
closed
2 months ago
0
Update Grok-1 and DBRX support in README
#176
yanboliang
closed
6 months ago
0
Remove nn.Embedding layer from model size
#175
yanboliang
closed
6 months ago
0
[example] Add support for DBRX
#174
yanboliang
opened
6 months ago
0
Throughput Benchmark Scripts
#173
HanGuo97
closed
6 months ago
2
Missing Keys in state_dict
#172
bjohn22
opened
6 months ago
2
[example] Added (hacky) Grok1 support
#171
Chillee
opened
6 months ago
2
Making TokenizerInterface more usable for the user's code.
#170
Artyom17
opened
6 months ago
0
Unified Llama 3 (8b,70b) + Safetensors support
#169
nivibilla
closed
5 months ago
20
Unified llama 3 support.
#168
nivibilla
closed
6 months ago
1
Tensor Parallel Inside notebook
#167
nivibilla
opened
6 months ago
3
Llama3 8b perf numbers on A100
#166
yanboliang
closed
5 months ago
0
mmap issue in bf16 of gpt-fast
#165
yanbing-j
opened
6 months ago
1