Closed by Hguimaraes 2 years ago
Hello Heitor,
Thank you for your attention. We released the LightHuBERT checkpoints at https://huggingface.co/mechanicalsea/lighthubert, and we can provide the configurations used to reproduce the LightHuBERT SUPERB downstream models.
We followed the default config.yaml (e.g., doc) officially provided by SUPERB, and we list the batch size (bsz) and learning rate (lr) as follows.
The s3prl added lighthubert as
If you consider different architectures in three lighthubert checkpoints, here can be helpful:
If you have any questions, don't hesitate to ask us.
Best wishes, Rui
Hi,
Thank you very much for your answer! Before opening this issue, I tried to reproduce the KS downstream model. I'm using the same batch size but tried with different learning rates:
But all of them are far from the expected value of 0.9607 from the leaderboard. Do you have the original CKPT file?
I'm now training the IC downstream task; I will try to reproduce the results and let you know if I achieve the values from SUPERB.
Best,
Hello,
For IC with the parameters you passed, I was able to get 96.94 (the SUPERB leaderboard says 98.23). Do you know what may be causing this difference? I looked into the s3prl code and everything seems to use a seed, so the result should be deterministic.
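For reference, the kind of determinism I'd expect can be sketched in plain Python (a general seeding pattern, not s3prl's actual code): with every random-number generator seeded, two runs draw identical values.

```python
# General reproducibility sketch (not s3prl's actual code): seeding every RNG
# in play is what makes repeated runs draw identical values.
import random

def seeded_run(seed):
    """Draw a few numbers after seeding; identical seeds give identical draws."""
    random.seed(seed)
    return [random.random() for _ in range(3)]

run_a = seeded_run(1337)
run_b = seeded_run(1337)  # same seed, so run_a == run_b
```

In a real torch pipeline one would also seed `torch.manual_seed` and NumPy; even then, nondeterministic CUDA kernels can introduce small run-to-run differences, though not a gap of this size.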
Best, Heitor
Hi Heitor,
The performance degradation is mainly due to the lack of waveform normalization. LightHuBERT models are all trained with normalized waveform inputs, but the interface provided by SUPERB feeds the raw inputs directly to the pre-trained model. To temporarily fix this issue, you can add a line before https://github.com/s3prl/s3prl/blob/master/s3prl/upstream/lighthubert/expert.py#L49 as

```python
# F is torch.nn.functional; normalize each waveform over its full length
wavs = [F.layer_norm(wav, wav.shape) for wav in wavs]
padded_wav = pad_sequence(wavs, batch_first=True)
```
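The normalization above can be sketched standalone: applying `F.layer_norm(wav, wav.shape)` to a whole waveform amounts to making the clip zero-mean and unit-variance. A minimal stdlib-only equivalent (an illustration, not the s3prl code path):

```python
# Standalone sketch of whole-waveform normalization, assumed equivalent to
# F.layer_norm(wav, wav.shape): subtract the mean, divide by the std deviation.
import math
import random

def normalize_wav(wav, eps=1e-5):
    """Normalize a waveform (list of samples) to zero mean and unit variance."""
    n = len(wav)
    mean = sum(wav) / n
    var = sum((x - mean) ** 2 for x in wav) / n
    scale = 1.0 / math.sqrt(var + eps)  # eps guards against silent (all-zero) clips
    return [(x - mean) * scale for x in wav]

random.seed(0)
wav = [random.gauss(0.5, 3.0) for _ in range(16000)]  # fake 1 s of 16 kHz audio
norm = normalize_wav(wav)
```

This matters because a model trained on normalized inputs sees a very different input scale when fed raw waveforms.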
Besides, we recommend configuring the subnet before https://github.com/s3prl/s3prl/blob/master/s3prl/upstream/lighthubert/expert.py#L17, like

```python
# select the searched subnet architecture before loading the weights
subnet = self.model.supernet.subnet
self.model.set_sample_config(subnet)
self.model.load_state_dict(checkpoint["model"], strict=False)
```

so that the subnet is correctly set; otherwise, a larger subnet would be chosen.
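The ordering matters because non-strict loading silently skips mismatched keys. A toy illustration (hypothetical names, not LightHuBERT's real API): if the sub-architecture isn't selected first, the model keeps its default (largest) configuration while only part of its weights get loaded.

```python
# Toy illustration (hypothetical names, not LightHuBERT's real API) of why the
# sub-architecture must be selected before loading a checkpoint non-strictly.
class ToySupernet:
    def __init__(self, max_layers=12):
        self.num_layers = max_layers          # default: the largest subnet
        self.weights = {}

    def set_sample_config(self, num_layers):
        """Activate a smaller sub-architecture inside the supernet."""
        self.num_layers = num_layers

    def load_state_dict(self, state, strict=True):
        """Mimic torch's non-strict loading: copy only the active layers' keys."""
        wanted = {f"layer.{i}" for i in range(self.num_layers)}
        missing = wanted - state.keys()
        if strict and missing:
            raise KeyError(sorted(missing))
        self.weights = {k: v for k, v in state.items() if k in wanted}

checkpoint = {f"layer.{i}": i for i in range(6)}  # a 6-layer subnet checkpoint

# Wrong order: the default (largest) subnet stays active with partial weights.
a = ToySupernet()
a.load_state_dict(checkpoint, strict=False)

# Right order: select the subnet first, then load.
b = ToySupernet()
b.set_sample_config(6)
b.load_state_dict(checkpoint, strict=False)
```

In the wrong-order case the model still reports 12 active layers even though the checkpoint only covers 6, which is exactly the "larger subnet would be chosen" failure mode.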
There is one more thing to notice. The SUPERB interface picks the first hidden state at a position different from the one in our experiments (before vs. after the positional convolution), but this probably does not make much difference to the performance.
Thanks for your attention! We will upload the CKPT files soon. Please let us know if you have any other questions.
Best, Qibing
Thank you very much for your support and your work, Rui and Qibing! I will use those new configurations and try to close the performance gap.
If you guys want, we can close the issue.
Best, Heitor
Thanks for your attention to our LightHuBERT. If you have any questions about it, don't hesitate to ask us. Have a nice day.
Rui
Hi,
Just to let you know, with those changes it is possible to get closer results.
I'm closing the issue! Thank you very much for your help.
Hello Mr. Wang!
First of all, I would like to thank you for your work and the effort to make it open source. I've been working on the robustness of SRL models, and I'm trying to reproduce the downstream models from SUPERB.
Do you have the CKPT files generated when training the SUPERB models? If not, could you share the parameters used in the config.yaml file for each task? With those, I could reproduce the numbers in the table.
Best regards, Heitor