yxli2123 / LoftQ


the train_clm.py file contains two similar main functions #11

Closed BaohaoLiao closed 9 months ago

BaohaoLiao commented 9 months ago

As the title says.

BaohaoLiao commented 9 months ago

In addition, I deleted the second main function in train_clm.py and tried to reproduce your result on wikitext with the provided script:

python train_clm.py \
--model_name_or_path LoftQ/Llama-2-7b-hf-bit4-rank64 \
--output_dir exp_results/wikitext-2/bit4-rank64_ft \
--learning_rate 1e-4  \
--seed 888 \
--dataset_name wikitext \
--dataset_config wikitext-2-raw-v1 \
--num_train_epochs 2 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 8 \
--do_train \
--do_eval \
--logging_steps 50 \
--evaluation_strategy epoch \
--report_to tensorboard \
--overwrite_output_dir \
--block_size 1024

The results are:

{'eval_loss': 2.330519199371338, 'eval_accuracy': 0.5374923472925482, 'eval_runtime': 266.2998, 'eval_samples_per_second': 1.085, 'eval_steps_per_second': 1.085, 'epoch': 1.0}
{'eval_loss': 2.464402675628662, 'eval_accuracy': 0.5212330921673483, 'eval_runtime': 266.3019, 'eval_samples_per_second': 1.085, 'eval_steps_per_second': 1.085, 'epoch': 2.0}

The perplexity is 11.76, far from the reported 5.24.
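For reference, this perplexity follows directly from the epoch-2 eval loss above (perplexity = exp(eval_loss)); a quick check:

python -c "import math; print(math.exp(2.464402675628662))"   # ≈ 11.76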

yxli2123 commented 9 months ago

Hi, this seems to be due to an incorrect model name and batch size. Please try LoftQ/Llama-2-7b-hf-4bit-64rank with a total batch size of 64. Meanwhile, I have uploaded a new version of the training script.
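As a minimal sketch, the corrected run on a single GPU could look like the command below; the exact flag split is an assumption, and any combination where per_device_train_batch_size × gradient_accumulation_steps × number of GPUs equals 64 should give the intended total batch size (all other flags are kept from the original command):

python train_clm.py \
--model_name_or_path LoftQ/Llama-2-7b-hf-4bit-64rank \
--output_dir exp_results/wikitext-2/bit4-rank64_ft \
--learning_rate 1e-4 \
--seed 888 \
--dataset_name wikitext \
--dataset_config wikitext-2-raw-v1 \
--num_train_epochs 2 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 64 \
--do_train \
--do_eval \
--logging_steps 50 \
--evaluation_strategy epoch \
--report_to tensorboard \
--overwrite_output_dir \
--block_size 1024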

BaohaoLiao commented 9 months ago

I can reproduce a perplexity of 5.60 now. Although it is slightly higher than the reported 5.24, I think it is acceptable.