Ching-Yee-Chan opened 1 month ago
Hi! Sorry for the delay in answering. Have you found anything?
The training was quite unstable even in our case, so the best hyperparameters may not be the ones we shipped, depending on your GPU configuration and batch size. I would suggest trying different learning rates and optimizers, and even different seeds, to see whether the problem has a deeper root than just bad luck.
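A minimal sketch of the kind of local LR/seed sweep suggested above, expressed as dora/Hydra command-line overrides. The override keys `optim.lr` and `seed`, and the values, are assumptions for illustration; check the solver config for the exact names before running.

```python
# Hypothetical sketch: enumerate dora run commands for a small
# LR x seed sweep on a single machine (no Slurm grid needed).
# Swap print() for subprocess.run(cmd) to actually launch them.
commands = []
for seed in (0, 1, 2):               # placeholder seeds
    for lr in (1e-4, 3e-4, 1e-3):    # placeholder learning rates
        commands.append(
            ["dora", "run",
             "solver=watermark/robustness",
             "dset=audio/voxpopuli",
             f"optim.lr={lr}",        # assumed override key
             f"seed={seed}"]          # assumed override key
        )

for cmd in commands:
    print(" ".join(cmd))
```

Running the sweep sequentially on one GPU is slow but avoids needing a cluster; each run gets its own dora signature, so checkpoints do not collide.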
I am trying to reproduce the training process from scratch on VoxPopuli-en. I kept all of your original hyperparameters, but the watermark-related losses stayed flat even after 30 epochs, while the g_loss quickly went negative. I noticed that since the watermark is added directly to the original waveform, there is a shortcut from the original audio to the watermarked audio. The model will therefore ignore the watermark and overfit the g_loss whenever the gradients of the watermark-related losses are comparatively small. So I tried to:
Settings:
I ran

```
dora run solver=watermark/robustness dset=audio/voxpopuli
```
on a single 48 GB GPU. Since I do not have access to a Slurm cluster, running a dora hyperparameter search may not be feasible. All hyperparameters follow `config/solver/watermark/default.yaml` except those mentioned above.

Any insights or suggestions on this problem would be appreciated.
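To make the shortcut failure mode concrete, here is a toy NumPy sketch (not the actual training code) of gradient descent on a sum of two losses whose gradient magnitudes differ by orders of magnitude; the scales and targets are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared parameter vector that both losses pull on.
theta = rng.normal(size=4)

def g_loss_grad(theta):
    # Strong gradient from the generator loss (optimum at theta = 0).
    return 1.0 * theta

def wm_loss_grad(theta):
    # Tiny gradient from the watermark loss (optimum at theta = 2).
    return 1e-4 * (theta - 2.0)

lr = 0.1
for _ in range(100):
    theta = theta - lr * (g_loss_grad(theta) + wm_loss_grad(theta))

# The generator objective has essentially converged...
print(np.allclose(theta, 0.0, atol=1e-2))                 # → True
# ...while the watermark objective has barely moved at all.
print(round(float(np.mean(np.abs(theta - 2.0))), 2))      # → 2.0
```

This mirrors what I observe: the optimizer follows the dominant g_loss gradient and the watermark-related losses stay flat, which is why rebalancing the loss weights (or detaching the shortcut path) seems necessary.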