nttcslab / byol-a

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation
https://arxiv.org/abs/2103.06695

Finetuning of BYOL-A #4

Closed mschiwek closed 3 years ago

mschiwek commented 3 years ago

Hi,

your paper is super interesting. I have a question regarding the downstream tasks. If I understand the paper correctly, you used a single linear layer for the downstream tasks, whose only input is the sum of the mean and the max of the representation over time.
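For reference, the pooling scheme described above can be sketched in a few lines of NumPy. The shapes and sizes here (96 time frames, 2048-dim embeddings, 10 classes) are illustrative assumptions, not the repository's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frame-level embeddings from the pretrained encoder:
# shape (time_frames, embedding_dim).
frames = rng.standard_normal((96, 2048))

# Pool over time: sum of mean and max, as described above.
pooled = frames.mean(axis=0) + frames.max(axis=0)   # shape (2048,)

# A single linear layer on top of the pooled vector
# (the frozen-encoder / linear-evaluation setting).
n_classes = 10
W = rng.standard_normal((n_classes, pooled.shape[0])) * 0.01
b = np.zeros(n_classes)
logits = W @ pooled + b
print(logits.shape)  # (10,)
```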

Did you try to finetune BYOL-A end-to-end on the downstream tasks after pretraining? In the case of TRILL, they were able to improve performance even further by finetuning the whole model end-to-end. Is there a specific reason why this is not possible with BYOL-A?

daisukelab commented 3 years ago

Hi @mschiwek, thank you for enjoying our paper!

Yes, I made a few attempts to finetune the BYOL-A model end-to-end on the downstream tasks, but I didn't continue at the time. There's no technical obstacle to finetuning; I just didn't pursue it further.
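Just to show what I mean by "no technical obstacle", end-to-end finetuning only requires passing the encoder's parameters to the optimizer instead of freezing them. This is a minimal PyTorch sketch; `Encoder` is a stand-in with assumed shapes, not the repository's actual model class:

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained encoder mapping (batch, time, dim_in)
# spectrogram frames to a (batch, dim) embedding via mean+max pooling.
class Encoder(nn.Module):
    def __init__(self, dim_in=64, dim=2048):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim_in, dim), nn.ReLU())

    def forward(self, x):            # x: (batch, time, dim_in)
        h = self.net(x)              # (batch, time, dim)
        return h.mean(dim=1) + h.max(dim=1).values

encoder = Encoder()                  # pretrained weights would be loaded here
head = nn.Linear(2048, 10)

# End-to-end: optimize *both* encoder and head parameters,
# instead of freezing the encoder as in linear evaluation.
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-4)

x = torch.randn(8, 96, 64)           # dummy batch of spectrograms
y = torch.randint(0, 10, (8,))       # dummy labels
loss = nn.functional.cross_entropy(head(encoder(x)), y)
loss.backward()                      # gradients flow into the encoder too
opt.step()
```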

The primary reason I didn't summarize all the results into nice tables is that I couldn't settle on the proper problem setting for "finetuning." Previous papers (including supervised-pretraining work) each use it in different ways: one uses data augmentations, another uses an MLP as the head for downstream tasks, and so on.

Since releasing BYOL-A to the public, I've been working on an analysis of BYOL-A that includes finetuning, and we're beginning to understand more, especially why ours works effectively.

I hope we can publish a follow-up paper soon with our newer findings; hopefully it will answer your question more fully.