mikeybellissimo / LoRA-MPT

A repo for finetuning MPT using LoRA. It is currently configured to work with the Alpaca dataset from Stanford but can easily be adapted to use another.
Apache License 2.0

LoRA model size is always ~400 bytes #7

Open erlakshmi123 opened 1 year ago

erlakshmi123 commented 1 year ago

I have tried this multiple times and I always get an adapter_model.bin of about 400 bytes, so it looks like the LoRA weights are not being trained or not being saved. The training data is small: ~2000 JSON lines in the Dolly prompt/response format.

python src/finetune.py \
    --base_model 'mosaicml/mpt-7b-instruct' \
    --data_path 'dataset/train_data.json' \
    --output_dir './lora-mpt' \
    --batch_size 256 \
    --micro_batch_size 4 \
    --num_epochs 100 \
    --learning_rate 3e-5 \
    --cutoff_len 1024 \
    --val_set_size 200 \
    --lora_r 4 \
    --lora_alpha 8 \
    --lora_dropout 0.05 \
    --lora_target_modules '[Wqkv]' \
    --train_on_inputs False \
    --group_by_length False \
    --use_gradient_checkpointing True \
    --load_in_8bit False \
    --needs_prompt_generation False
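One quick way to narrow this down is to inspect the saved adapter file directly: a ~400-byte adapter_model.bin typically means an (almost) empty state dict was written at save time. A minimal sketch, assuming the output path from the command above (older peft versions write adapter_model.bin; newer ones may write adapter_model.safetensors instead):

```python
import os
import torch

# Path assumed from --output_dir './lora-mpt' in the command above.
adapter_path = "./lora-mpt/adapter_model.bin"

print(f"File size: {os.path.getsize(adapter_path)} bytes")

# Load the raw state dict PEFT saved and list the tensors it contains.
state_dict = torch.load(adapter_path, map_location="cpu")
print(f"Number of tensors: {len(state_dict)}")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))

# Zero tensors here means the LoRA weights were never captured when
# save_pretrained ran, which would match the ~400-byte file reported above.
```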

mikeybellissimo commented 1 year ago

Hi! Have you been getting any errors/warning alongside this?

erlakshmi123 commented 1 year ago

I have not seen any errors.

madaracelio commented 1 year ago

Hi @erlakshmi123! Try using a smaller batch_size (8, 16, or 32) and fewer epochs (8 or 10) if you don't have a large dataset. It should work.
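For context on why the batch size matters here: if finetune.py derives gradient accumulation as batch_size // micro_batch_size (an assumption about this repo's script, in the style of alpaca-lora), then batch_size 256 with ~2000 examples gives only a handful of optimizer steps per epoch. A rough back-of-the-envelope calculation:

```python
# Rough arithmetic, assuming each optimizer step consumes `batch_size` examples
# via gradient accumulation (an assumption about finetune.py, not confirmed).
train_examples = 2000 - 200          # ~2000 JSON lines minus val_set_size=200
batch_size = 256
micro_batch_size = 4

grad_accum_steps = batch_size // micro_batch_size  # 64 forward passes per optimizer step
steps_per_epoch = train_examples // batch_size     # ~7 optimizer steps per epoch
print(grad_accum_steps, steps_per_epoch)

# With batch_size=32 instead, the same data yields ~56 optimizer steps per epoch,
# which is why dropping the batch size tends to help on small datasets.
```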