v-iashin / SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)
https://v-iashin.github.io/SpecVQGAN
MIT License

Overfitting occurs when training transformer #44

Closed. Ivvvvvvvvvvy closed this issue 8 months ago.

Ivvvvvvvvvvy commented 8 months ago

Hello, I have recently been training SpecVQGAN on a small dataset (only 2000 pairs of 10 s audio and video). When I fine-tune from the pre-trained model, the codebook results are very good. But when I fine-tuned the transformer from the VAS pre-trained model, it overfit severely: val/loss started to rise again after falling to 2.71. Changing the dropout parameter in first_stage_permuter_config (I tried 0.3 and 0.6) had no impact on the model's val/loss. Which parameters in the transformer.yaml file should I modify to alleviate overfitting?

v-iashin commented 8 months ago

Hi. I don't have any specific advice for you here; you can look into the techniques that are usually used to combat overfitting in machine learning.

At the same time, I don't think such high dropout values would help.
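For reference, here is a minimal sketch of where the GPT dropout knobs usually sit in the transformer config. The key names (embd_pdrop, resid_pdrop, attn_pdrop) and the module path follow the minGPT implementation that SpecVQGAN builds on, so verify them against your checkout; note that first_stage_permuter_config configures a different component, so a dropout value set there would not necessarily reach the transformer at all:

```yaml
# Sketch of the relevant part of transformer.yaml.
# Key names and module path assumed from the minGPT-style GPT config;
# check them against your version of the repo.
model:
  params:
    transformer_config:
      target: specvqgan.modules.transformer.mingpt.GPT
      params:
        embd_pdrop: 0.1   # dropout on token + position embeddings
        resid_pdrop: 0.1  # dropout on the residual branches
        attn_pdrop: 0.1   # dropout inside self-attention
```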

The transformer is GPT-2 (~300M parameters), which is a lot of parameters for your dataset, to be honest. Maybe you can try smaller variants of GPT-2, as in the sketch below.
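If you go the smaller-model route, the capacity is set by n_layer, n_head, and n_embd in the same transformer config. The values below are illustrative (roughly GPT-2 small scale), not defaults taken from this thread:

```yaml
# Hypothetical smaller configuration; the VAS transformer shipped with
# the repo is much larger (on the order of 300M parameters).
model:
  params:
    transformer_config:
      params:
        n_layer: 12   # fewer transformer blocks than the default
        n_head: 12
        n_embd: 768   # GPT-2 small width, roughly 100M parameters
```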

v-iashin commented 8 months ago

In my experience, the loss overfits quickly in the second stage, so this didn't come as a surprise to me.