NVIDIA / audio-flamingo

PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.
MIT License

sharing weights of the pretrained stage model #13

Open jasonppy opened 1 month ago

jasonppy commented 1 month ago

Hi Zhifeng,

Since there are a few datasets that I cannot obtain, is it possible to share the weights of the model after the pretraining stage?

Also, this is the training loss of the pretraining stage (quite high variance):

[image: pretraining-stage training loss curve]

And this is the training loss of the SFT stage:

[image: SFT-stage training loss curve]

Do they look right to you?

My reproduced results are a bit far from the reported results, and sharing the weights of the model after the pretraining stage would greatly help me narrow down the issue.

Appreciate your time and effort!

Puyuan

zhifengkongnv commented 1 month ago

We cannot open-source the additional model checkpoint due to review policy.

The training loss (SFT) looks fine to me: if you set smoothing=0.6, it oscillates between 0.5 and 1.5.

[screenshot: training loss curve, 2024-08-13]
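For readers comparing their own curves against the smoothing=0.6 setting mentioned above, here is a minimal sketch of a bias-corrected exponential moving average of the kind used by the smoothing slider in common training dashboards. The thread does not say which tool or formula produced the plot, so the function name `ema_smooth` and the exact debiasing step are assumptions for illustration.

```python
def ema_smooth(values, weight=0.6):
    """Exponentially smooth a sequence of loss values.

    Assumption: `weight` plays the role of the dashboard "smoothing"
    setting (0 = no smoothing, values close to 1 = heavy smoothing).
    """
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")

    smoothed = []
    running = 0.0  # running average, started at zero
    for step, value in enumerate(values, start=1):
        running = weight * running + (1.0 - weight) * value
        # Divide by (1 - weight**step) to undo the bias from the zero start.
        smoothed.append(running / (1.0 - weight ** step))
    return smoothed
```

Applying this to the raw per-step loss before plotting makes it easier to judge whether a noisy curve sits in the same range as another run.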

The validation loss is more informative. For example, the valid_losses of 3 different validation sets below indicate the model learns well on these tasks.

[screenshot: validation loss curves for three validation sets, 2024-08-13]
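One way to apply the advice about validation loss is to compare the early and late validation loss for each validation set separately. The sketch below is only an illustration: the function name `summarize_valid_losses`, the `window` parameter, and the input layout (a dict from validation-set name to a list of losses ordered by evaluation step) are hypothetical and not taken from the Audio Flamingo code.

```python
from statistics import mean


def summarize_valid_losses(history, window=3):
    """Compare early vs. late validation loss for each validation set.

    history: dict mapping a validation-set name to a list of losses,
             ordered by evaluation step (hypothetical layout).
    window:  number of evaluations averaged at each end of the run.
    """
    summary = {}
    for name, losses in history.items():
        if len(losses) < 2 * window:
            raise ValueError(f"{name}: need at least {2 * window} evaluations")
        start = mean(losses[:window])   # average of the first evaluations
        end = mean(losses[-window:])    # average of the most recent evaluations
        summary[name] = {"start": start, "end": end, "improved": end < start}
    return summary
```

A set whose `improved` flag stays false while the training loss keeps falling is a natural place to start looking for a data or preprocessing mismatch.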