kkoutini / PaSST

Efficient Training of Audio Transformers with Patchout
Apache License 2.0

Changing the depth of PASST. #47

Open Rishabh-S1899 opened 3 months ago

Rishabh-S1899 commented 3 months ago

I want to change the depth of the transformer while fine-tuning the model. I am using the following command (inspired by the ESC50 example):

python3 ex_dcase.py with models.net.s_patchout_t=10 models.net.s_patchout_f=5 basedataset.fold=1 -p

I have already prepared the ex_dcase.py and dataset.py files for the DCASE2020 dataset (based on the ESC50 files you provided), and I have been able to fine-tune the whole model once. Now I want to add a depth parameter to the command line of the fine-tuning script, so that I can control how many blocks of the architecture I fine-tune. Currently I change the depth by editing the depth variable of the desired architecture here. Please suggest the changes I need to make so that I can run a command from the command line and fine-tune only selected layers.

kkoutini commented 3 months ago

There are two options. The first is to make a new function similar to passt_s_kd_p16_128_ap486, make the depth a parameter of that function, and add it as a parameter to the get_model method. In that case, you can change it directly from the command line using sacred syntax:

python ex_audioset.py with models.net.depth=x -p
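As a rough illustration of the first option, here is a minimal pure-Python sketch of threading a depth argument from a model getter down to an architecture factory. The names ArchConfig, passt_custom_depth, ARCH_FACTORIES and get_model_sketch are hypothetical stand-ins, not the repository's actual functions or signatures, and the embed_dim and num_heads values are only illustrative defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchConfig:
    """Hypothetical container for the hyperparameters a factory would pass on."""
    embed_dim: int
    depth: int
    num_heads: int


def passt_custom_depth(depth=12):
    """Hypothetical factory, analogous in spirit to passt_s_kd_p16_128_ap486,
    but with the number of transformer blocks exposed as an argument."""
    if depth < 1:
        raise ValueError(f"depth must be >= 1, got {depth}")
    # embed_dim and num_heads are illustrative values, not read from the repo.
    return ArchConfig(embed_dim=768, depth=depth, num_heads=12)


# Hypothetical registry mapping an architecture name to its factory.
ARCH_FACTORIES = {"passt_custom_depth": passt_custom_depth}


def get_model_sketch(arch="passt_custom_depth", depth=12):
    """Simplified stand-in for a model getter that forwards depth to the factory."""
    try:
        factory = ARCH_FACTORIES[arch]
    except KeyError:
        raise ValueError(f"Unknown arch: {arch!r}") from None
    return factory(depth=depth)


if __name__ == "__main__":
    # Roughly what an override such as models.net.depth=6 would end up doing.
    print(get_model_sketch(depth=6))
```

Once depth is an argument of the getter, a config override on the command line only has to change that one value; the factory itself stays untouched.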

However, if you are loading a pretrained model, this may break the loading, since the model now has a different set of weights. As the second option, I recommend using the lighten_model method (as used in passt-L), which reduces the depth of the pre-trained model.

Example:

python ex_audioset.py with models.net.depth=x models.net.cut_depth=-2 -p

This will remove every other layer from the pre-trained model to reduce the depth.
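To picture what "removing every other layer" means, here is a minimal sketch that thins a list of blocks with a slice. It is not the repository's lighten_model implementation: thin_blocks is a hypothetical helper, the mapping from cut_depth=-2 to a step of 2 is an assumption, and the real method may treat the first and last blocks differently.

```python
def thin_blocks(blocks, step=2):
    """Keep every `step`-th block, starting from the first one.

    Hypothetical helper for illustration only; with step=2 this drops
    every other block, which halves the depth of a 12-block stack.
    """
    if step < 1:
        raise ValueError(f"step must be >= 1, got {step}")
    return list(blocks[::step])


if __name__ == "__main__":
    # Stand-ins for the 12 transformer blocks of a pre-trained model.
    pretrained_blocks = [f"block_{i}" for i in range(12)]
    kept = thin_blocks(pretrained_blocks, step=2)
    print(len(kept), kept)
```

Because the kept blocks are taken from the already-loaded model, their pre-trained weights come along unchanged, which is why this route avoids the loading problem mentioned for the first option.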