ZebangCheng / Emotion-LLaMA

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

About 2-stage training #21

Open Archie1121 opened 5 hours ago

Archie1121 commented 5 hours ago

Hello, author! In your paper, you mentioned that you used a two-stage training strategy with coarse-grained and fine-grained annotation JSON files. Regarding the code, should I replace the "ann_path" in featureface.yaml when moving to the second stage? If I replace the coarse-grained JSON path with the fine-grained JSON path, how does the .pth file from stage 1 get loaded correctly?

ZebangCheng commented 4 hours ago

If you have run the Stage 1 pre-training code, the model checkpoint files are saved in the checkpoints/save_checkpoint directory. Take the checkpoint from the last epoch and set its path in the configuration file:

# Set Emotion-LLaMA path
ckpt: "/home/czb/project/Emotion-LLaMA/checkpoints/save_checkpoint/2024xxxx-v1/checkpoint_29.pth"

In short, replace the original minigptv2_checkpoint.pth weights with the weights obtained from the first training stage.
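Putting both pieces together, the Stage 2 switch in featureface.yaml would then involve two edits, roughly as sketched below; the fine-grained JSON filename is only a placeholder, and the exact key names should be checked against your own config:

# Stage 2: point ann_path to the fine-grained annotation file (placeholder path)
ann_path: "/path/to/fine_grained_annotations.json"
# Load the Stage 1 weights instead of the original minigptv2_checkpoint.pth
ckpt: "/home/czb/project/Emotion-LLaMA/checkpoints/save_checkpoint/2024xxxx-v1/checkpoint_29.pth"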

Archie1121 commented 3 hours ago

Thanks a lot! By the way, in your paper, you mentioned that you fine-tuned on the DFEW dataset. For this fine-tuning, did you continue training from the Emotion-LLaMA model obtained after the first stage, from the model after the second stage, or did you re-train from scratch as you did with the MERR dataset? Additionally, for the DFEW dataset, besides extracting the corresponding features, do you also need to compute and identify AUs to obtain coarse-grained or fine-grained labels?