princeton-nlp
/
MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
MIT License
1.02k
stars
60
forks
issues
question about MeZO-adam
#37
zhaoaustin
opened
3 weeks ago
1
Can you share the dataset class of SST-5, SNLI, TREC datasets?
#36
Ziiiirem
opened
4 weeks ago
5
roberta-large zero shot
#35
itongggg
opened
1 month ago
0
can not reproduce the result of roberta large on dataset sst-2
#34
itongggg
opened
2 months ago
2
Maybe need a requirement.txt file to facilitate environment preparation?
#33
lepangdan
opened
3 months ago
1
In which file is the code implemented by the algorithm?
#32
1llss
opened
5 months ago
1
Zero Order implementation does not converge in CIFAR-10 dataset.
#31
amritansh6
opened
6 months ago
1
Standard FT does not work
#30
YaNgZhAnG-V5
opened
7 months ago
3
max_seq_length and max_seq_len confusion
#29
davidqqq
opened
8 months ago
1
Cannot reproduce some results of OPT
#28
WangFei-2019
closed
3 weeks ago
3
How to use MeZO in training a simple CIFAR-10 model
#27
Cascol-Chen
opened
8 months ago
3
Add a pip-installable, simple implementation of MeZO (along with a distributed impl. and some tests)
#26
lebrice
opened
8 months ago
3
Results of Trec dataset on Roberta-large(K=512) with MeZO(LoRA)
#25
Yanjun-Zhao
opened
8 months ago
8
Inconsistent results of MEZO for RoBERTa-large on SST-2
#24
han678
opened
10 months ago
0
MeZO on ChatGLM6B
#23
CharonsPluto
closed
4 months ago
2
LoRA & p-tuning with multi-GPU
#22
haozhouamzn
opened
10 months ago
3
Cannot reproduce the results for RoBERTa on SST-2
#21
TrueNobility303
opened
1 year ago
1
llama2 problem
#20
ghost
opened
1 year ago
1
ValueError: The model did not return a loss from the inputs, only the following keys: logits,past_key_values. For reference, the inputs it received are input_ids,attention_mask.
#19
thistleknot
closed
1 year ago
2
AttributeError: 'TrainingArguments' object has no attribute 'linear_probing'
#18
thistleknot
closed
1 year ago
4
Nanogpt implementation
#17
thistleknot
opened
1 year ago
3
Cannot reproduce the results of OPT on SST2
#16
sglucas
closed
1 year ago
15
Results on WSC and WIC datasets cannot be reproduced on OPT-13B with MeZO
#15
MathIsAll
opened
1 year ago
5
About experimental setting of 1000 examples
#14
sglucas
closed
1 year ago
2
MeZO on continue pre-training
#13
shan23chen
opened
1 year ago
1
deepspeed reference on colab
#12
huu4ontocord
closed
1 year ago
2
Getting a RuntimeError after training with mezo
#11
sowmaster
opened
1 year ago
6
Which trainer to use
#10
HaniItani
opened
1 year ago
7
MeZO running script for roberta-large is not working
#9
sanyalsunny111
closed
1 year ago
1
gpt_neo not supported
#8
thistleknot
closed
1 year ago
8
Best parameters found for datasets
#7
vvvm23
opened
1 year ago
3
Not convergent in custom dataset.
#6
jcao-ai
opened
1 year ago
9
Can you provide more details about how to run the code?
#5
kiseliu
closed
1 year ago
1
Can MeZO be used in NLG tasks?
#4
anonNo2
opened
1 year ago
5
Fix typo in run.py
#3
eltociear
closed
1 year ago
0
Impact of Dropout?
#2
helpmefindaname
closed
1 year ago
1
Any benchmark on (MeZO) v.s. (ZeRO + CpuOffload + Grad checkpointing) ?
#1
xingchensong
closed
1 year ago
2