Open jerpint opened 9 months ago
Please see https://github.com/LAION-AI/CLAP?tab=readme-ov-file#dataset-format for details on the dataset format we trained on. I think you can reuse the training script for fine-tuning, but remember to change the learning rate and the weight initialization so that training starts from a pretrained checkpoint rather than from scratch.
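As a rough illustration of preparing {audio, text} pairs for a tar-based pipeline, here is a minimal sketch using only the Python standard library. It assumes a webdataset-style layout in which each sample is an audio file plus a `.json` file with the same basename, and that the caption lives under a `"text"` key. Both are assumptions, so please check them against the dataset-format section linked above. `write_shard` is a hypothetical helper and not part of the CLAP repository.

```python
import io
import json
import tarfile
from pathlib import Path


def write_shard(pairs, shard_path):
    """Pack (audio_path, caption) pairs into a single tar shard.

    Each sample becomes two tar members that share a basename,
    e.g. 000000.flac and 000000.json (assumed layout, see the lead-in).
    Returns the member names in the order they were written.
    """
    names = []
    with tarfile.open(shard_path, "w") as tar:
        for index, (audio_path, caption) in enumerate(pairs):
            key = f"{index:06d}"
            audio_name = key + Path(audio_path).suffix
            # Store the audio file unchanged under the shared key.
            tar.add(str(audio_path), arcname=audio_name)

            # Store the caption as JSON; the "text" key is an assumption.
            meta = json.dumps({"text": caption}).encode("utf-8")
            info = tarfile.TarInfo(name=key + ".json")
            info.size = len(meta)
            tar.addfile(info, io.BytesIO(meta))

            names += [audio_name, key + ".json"]
    return names
```

Calling `write_shard([("clip.flac", "a dog barking")], "shard-000000.tar")` would write one sample to one shard. Resampling or re-encoding the audio is deliberately left out here.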
Hello!
Suppose I have a dataset of {audio, text} pairs, and I would like to fine-tune CLAP on it. Do you have any tips for getting started with such a task? Would continuing training from a checkpoint with a smaller learning rate be a good start? Do you have any scripts for doing something similar?
Thanks