Nota-NetsPresso/shortened-llm
Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]
63 stars, 8 forks
issues
path to use for --resume_from_checkpoint
#19
sriyachakravarthy
opened
1 week ago
0
Using shortened-llm for Instruct models
#18
sriyachakravarthy
opened
1 week ago
2
Loading using the AutoModelForCausalLM.from_pretrained method
#17
sriyachakravarthy
opened
2 weeks ago
0
Can we extend it to instruct models?
#16
ambuje
closed
2 weeks ago
4
Reduce C4 load time with subset from Nota's S3
#15
bokyeong1015
opened
1 month ago
0
CUDA out of memory when running taylor with llama3-8b
#14
yaolu-zjut
opened
3 months ago
0
Reproducing paper results
#13
qe660212
closed
3 months ago
1
How to add additional block analysis data for other models
#12
botox-100
closed
3 months ago
1
Support GPTQ
#11
lifelongeeek
closed
3 months ago
1
Add arXiv-v2 contents
#10
bokyeong1015
closed
4 months ago
0
Support LLaMA-2
#9
bokyeong1015
closed
4 months ago
0
How can this method be applied to other large language models?
#8
Franklin-L
closed
4 months ago
1
Support LLaMA-3
#7
bokyeong1015
closed
5 months ago
0
Add scripts for Gemma
#6
bokyeong1015
closed
6 months ago
0
Refactor data-related code & Add BOS-token option
#5
bokyeong1015
closed
6 months ago
2
Apply ruff
#4
bokyeong1015
closed
6 months ago
0
Fix evaluation script
#3
bokyeong1015
closed
6 months ago
0
Fix evaluation script
#2
bokyeong1015
closed
6 months ago
0
Add functions for pruning and retraining
#1
bokyeong1015
closed
7 months ago
0