speechbrain / benchmarks

This repository contains the SpeechBrain Benchmarks
Apache License 2.0

added SSL benchmark and tests #3

Closed salah-zaiem closed 1 year ago

salah-zaiem commented 1 year ago

Added here the SSL Benchmark and the tests, another PR is needed on the speechbrain library to add the WeightedSSLModel class.
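For context, the idea behind the WeightedSSLModel class is to combine the hidden states of all SSL layers with learned scalar weights before probing. A minimal NumPy sketch of that weighted combination (the function name and shapes are illustrative, not the actual SpeechBrain API):

```python
import numpy as np

def weighted_layer_sum(hidden_states, layer_logits):
    """Combine per-layer SSL representations with softmax-normalized
    learned weights, one scalar per layer.

    hidden_states: list of L arrays, each of shape (time, features)
    layer_logits:  array of shape (L,), the learnable weights
    """
    logits = np.asarray(layer_logits, dtype=float)
    weights = np.exp(logits - logits.max())        # numerically stable softmax
    weights /= weights.sum()
    stacked = np.stack(hidden_states)              # (L, time, features)
    return np.tensordot(weights, stacked, axes=1)  # (time, features)
```

In the real model the logits would be trainable parameters updated jointly with the downstream probe.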

mravanelli commented 1 year ago

Thank you for this PR, @salah-zaiem! You can find my comments below.

  1. Provide a brief description of the benchmark, explaining why it is needed and what is implemented. You can include a link to your paper and encourage readers to cite it.
  2. Mention the datasets used for this benchmark, providing download links. It might be helpful to present the datasets in a table format due to the large number available.
  3. In the Quickstart section, provide clear instructions on how to run both a single experiment and a batch of experiments, with runnable code snippets. For example, explain how to run a single experiment using the command: python train.py hparams.yaml. Additionally, explain how to switch the SSL model by changing ssl_hub, and guide users on assessing their own (local) SSL models. Finally, include instructions on running multiple experiments with run_benchmark.sh, again with runnable code snippets.
  4. Include a Results section with a table displaying the performance numbers for all the tasks and models used. Make sure to add links to the output folders of each model (you can upload them to the SpeechBrain Dropbox). I can grant you access to the SpeechBrain Dropbox when you are ready.
  5. Mention the computational resources required to run the experiments. You can refer to WavLM and include a table indicating the time and GPU memory needed for each experiment.
  6. Finally, add the reference to the SSL_Benchmark paper at the end but before the reference to the Speechbrain paper.
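Regarding point 3, a hypothetical hparams fragment for swapping the SSL model could look like the following (the ssl_hub key follows the PR description; the checkpoint name and local path are placeholders for illustration):

```yaml
# Use a HuggingFace checkpoint ...
ssl_hub: microsoft/wavlm-large

# ... or point to a local SSL model instead:
# ssl_hub: /path/to/my_local_ssl_model
```

Since SpeechBrain hparams keys can also be overridden on the command line (as with --data_folder above), the Quickstart could show the same switch as a --ssl_hub=... override without editing the YAML.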

Moreover, a few additional comments:

  1. I have merged your PR on the Speechbrain repository, specifically on the develop branch, for the weighted SSL model.
  2. Regarding the different data folders for each dataset (indicated by !PLACEHOLDER in the YAML files), I am curious how you handle that in run_benchmark.sh. I would have expected users to specify the data folder for each dataset.
  3. As of now, I have only tested the LibriSpeech LSTM recipe. However, when I ran it using the command:
python train.py hparams/ssl.yaml --data_folder=$SLURM_TMPDIR/librispeech

I encountered the following error:

ValueError: 'ngram_lm_path' is a !PLACEHOLDER and must be replaced.

It also appears that ngram_lm_path is not used. If it is required in other recipes, I believe we should automatically download it from the SpeechBrain Dropbox during data preparation.
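On the auto-download point, a minimal sketch of the kind of helper data preparation could call (the function name and URL handling are illustrative; SpeechBrain also ships its own download utilities):

```python
import os
import urllib.request

def fetch_if_missing(url: str, dest: str) -> str:
    """Download url to dest only if dest does not already exist,
    keeping data preparation idempotent across reruns."""
    if not os.path.isfile(dest):
        os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
        urllib.request.urlretrieve(url, dest)
    return dest
```

Data preparation could then resolve ngram_lm_path with a call such as fetch_if_missing(lm_url, ngram_lm_path) instead of leaving it as a !PLACEHOLDER the user must fill in manually.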

Apart from documentation, I believe we should put effort into making the benchmark as user-friendly as possible. Our goal is for users to be able to run it without encountering too many issues. @TParcollet, any idea or comment?

mravanelli commented 1 year ago

Thank you @salah-zaiem for addressing my comments! I made some minor modifications, mainly to the README file. I have a couple of final suggestions:

  1. We need to assign a meaningful name to the benchmark. This is crucial. As a suggestion, I propose the name "MPSS" (Multi-Probe Self-Supervised Benchmark). You might have better ideas. Please take some time to think about it.

  2. While testing the LSTM recipe on Librispeech, I encountered a warning:

/scratch/ravanelm/myenv1/lib/python3.10/site-packages/transformers/configuration_utils.py:380: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments. `

I would recommend investigating this warning further. It seems that all the recipes might be affected by this issue in future versions of transformers.
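One way to address it, sketched below under the assumption that the recipes currently pass gradient_checkpointing through the HuggingFace config kwargs: strip the deprecated key before building the config, then call model.gradient_checkpointing_enable() after loading when it was requested (the helper name is hypothetical):

```python
def pop_gradient_checkpointing(config_kwargs: dict):
    """Remove the deprecated 'gradient_checkpointing' entry from the
    config kwargs and report whether it was requested, so the caller
    can invoke model.gradient_checkpointing_enable() after loading."""
    kwargs = dict(config_kwargs)  # avoid mutating the caller's dict
    enabled = bool(kwargs.pop("gradient_checkpointing", False))
    return kwargs, enabled
```

This keeps the recipes compatible with transformers v5, where passing the flag at config initialization is slated for removal.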

mravanelli commented 1 year ago

Thank you @salah-zaiem! Everything looks good to me!