X-LANCE / SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model
MIT License
511 stars, 43 forks

LoRA weights and config are not generated when finetuning the model for AAC task with peft #103

Closed (alifarrokh closed this issue 3 months ago)

alifarrokh commented 3 months ago

System Info

OS: Ubuntu 22.04.3 LTS
Python Version: 3.10.12
PyTorch Version: 2.0.1
GPU: 1x Nvidia RTX A6000
CUDA Version: 12.4

Information

🐛 Describe the bug

I am trying to fine-tune the model for the AAC task using LoRA. I use the official fine-tuning script, examples/aac_audiocaps/scripts/finetune_eat_audiocaps.sh, with slight modifications to the config parameters. Training and inference work when PEFT is disabled (i.e., when only the linear layer is fine-tuned). When I enable PEFT, however, the LoRA weights (adapter_model.safetensors) and config (adapter_config.json) are not written to the output directory.

The output directory looks the same in both cases (with and without LoRA) after training finishes:

output_dir
+--- .hydra
|    +--- config.yaml
|    +--- hydra.yaml
|    +--- overrides.yaml
+--- aac_epoch_x_step_y
|    +--- model.pt
+--- finetune_aac.log
+--- train.log

Here is the config:

hydra_args="
hydra.run.dir=$output_dir \
++model_config.llm_name=vicuna-7b-v1.5 \
++model_config.llm_path=$llm_path \
++model_config.llm_dim=4096 \
++model_config.encoder_fairseq_dir=$fairseq_eat_path \
++model_config.encoder_name=eat \
++model_config.encoder_ds_rate=2 \
++model_config.encoder_projector_ds_rate=$encoder_projector_ds_rate \
++model_config.encoder_path=$audio_encoder_path \
++model_config.encoder_dim=768 \
++model_config.encoder_projector=linear \
++dataset_config.encoder_projector_ds_rate=${encoder_projector_ds_rate} \
++dataset_config.dataset=audio_dataset \
++dataset_config.train_data_path=$train_jsonl_path \
++dataset_config.val_data_path=$val_jsonl_path \
++dataset_config.input_type=mel \
++dataset_config.fbank_mean=-4.268 \
++dataset_config.fbank_std=4.569 \
++dataset_config.model_name=eat \
++dataset_config.fixed_length=true \
++dataset_config.target_length=1024 \
++train_config.num_epochs=1 \
++train_config.model_name=aac \
++train_config.freeze_encoder=true \
++train_config.freeze_llm=true \
++train_config.batching_strategy=custom \
++train_config.warmup_steps=1000 \
++train_config.total_steps=100000 \
++train_config.lr=$lr \
++train_config.validation_interval=1 \
++train_config.batch_size_training=$btz \
++train_config.val_batch_size=$btz \
++train_config.num_workers_dataloader=4 \
++train_config.use_fp16=true \
++train_config.output_dir=$output_dir \
++train_config.seed=${seed} \
++train_config.use_peft=true \
++log_config.log_file="${output_dir}/train.log" \
++metric=acc \
"

As a side note, training and validation accuracy both improve when PEFT is enabled, which suggests that training with PEFT works and that only the learned weights are missing from disk.

Error logs

Training completes without any error, but the LoRA weights and config files are not generated.

Expected behavior

The training script should generate the LoRA files along with model.pt.

ddlBoJack commented 3 months ago

The current code version integrates all the learnable weights into the model.pt checkpoint, including the LoRA weights.
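One way to confirm this on your own checkpoint is to list the parameter names stored in model.pt and look for LoRA entries. The following is a minimal sketch, not code from the repository. It assumes that model.pt holds a flat state dict (name -> tensor) and that the LoRA parameters follow PEFT's usual naming, where the key contains "lora_" (for example lora_A and lora_B). The helper names split_lora_keys and load_checkpoint_keys are hypothetical.

```python
from typing import Iterable, List, Tuple


def split_lora_keys(keys: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split parameter names into (lora, other) lists.

    Assumption: LoRA parameters contain "lora_" in their name,
    as in PEFT's usual lora_A / lora_B naming.
    """
    lora: List[str] = []
    other: List[str] = []
    for key in keys:
        if "lora_" in key:
            lora.append(key)
        else:
            other.append(key)
    return lora, other


def load_checkpoint_keys(path: str) -> List[str]:
    """Return the parameter names stored in a checkpoint file.

    Assumption: the file is a flat state dict (name -> tensor).
    """
    import torch  # imported lazily so split_lora_keys works without PyTorch

    state_dict = torch.load(path, map_location="cpu")
    return list(state_dict.keys())
```

Calling split_lora_keys(load_checkpoint_keys("output_dir/aac_epoch_x_step_y/model.pt")) on a checkpoint trained with train_config.use_peft=true should return a non-empty first list if the LoRA weights were saved into model.pt, and an empty one for a run without PEFT.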

alifarrokh commented 3 months ago

So there is no need to set the peft_ckpt argument for inference, and setting train_config.use_peft=true is sufficient, right?

ddlBoJack commented 3 months ago

Exactly.

alifarrokh commented 3 months ago

Thank you very much.