kkoutini / PaSST

Efficient Training of Audio Transformers with Patchout
Apache License 2.0

Pretrained models config #25

Open dlthdus0611 opened 1 year ago

dlthdus0611 commented 1 year ago

Hi, how can I find the configurations used for the pre-trained models, e.g. u_patchout, s_patchout_t, s_patchout_f, etc.?

Thank you!

kkoutini commented 1 year ago

Hi, here are the configs for the main pre-trained models:

passt-s-f128-p16-s10-ap.476.pt: {"embed": "default", "input_fdim": 128, "input_tdim": 998, "s_patchout_f": 4, "s_patchout_t": 40, "tstride": 10, "arch": "deit_base_distilled_patch16_384", "pretrained": true, "n_classes": 527, "in_channels": 1, "fstride": 10, "u_patchout": 0, "audioset_pretrain": false, "instance_cmd": "get_model"} 

passt-s-f128-p16-s10-ap.472.pt: {"embed": "default", "arch": "passt_deit_bd_p16_384", "pretrained": true, "n_classes": 527, "in_channels": 1, "fstride": 10, "tstride": 10, "input_fdim": 128, "input_tdim": 998, "u_patchout": 0, "s_patchout_t": 40, "s_patchout_f": 4, "instance_cmd": "get_model"} 

passt-s-f128-p16-s14-ap.469_swa471.pt: {"embed": "default", "arch": "passt_deit_bd_p16_384", "fstride": 14, "s_patchout_f": 3, "s_patchout_t": 30, "tstride": 14, "pretrained": true, "n_classes": 527, "in_channels": 1, "input_fdim": 128, "input_tdim": 998, "u_patchout": 0, "instance_cmd": "get_model"} 

passt-s-f128-p16-s16-ap.468_swa473: {"embed": "default", "fstride": 16, "input_fdim": 128, "input_tdim": 998, "s_patchout_f": 1, "s_patchout_t": 20, "tstride": 16, "arch": "deit_base_distilled_patch16_384", "pretrained": true, "n_classes": 527, "in_channels": 1, "u_patchout": 0, "audioset_pretrain": false, "instance_cmd": "get_model"} 

passt-s-f128-p16-s12-ap.470_swa473.pt: {"embed": "default", "arch": "passt_deit_bd_p16_384", "fstride": 12, "s_patchout_f": 3, "s_patchout_t": 40, "tstride": 12, "pretrained": true, "n_classes": 527, "in_channels": 1, "input_fdim": 128, "input_tdim": 998, "u_patchout": 0, "instance_cmd": "get_model"} 

If you need the configuration for other specific models, let me know.
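To relate these values to sequence length, here is a minimal sketch of how the patch grid could be derived from a config. It assumes a 16x16 patch (suggested by "p16" and "patch16" in the names above), an unpadded strided patch embedding, structured patchout that drops whole frequency rows and time columns, and unstructured patchout that drops that many of the remaining patches. The helper names `patch_grid` and `patch_counts` are hypothetical and not part of the PaSST code base.

```python
PATCH_SIZE = 16  # assumption: read off "p16" / "patch16" in the model names


def patch_grid(input_fdim, input_tdim, fstride, tstride, patch=PATCH_SIZE):
    """Number of patches along frequency and time for an unpadded strided embedding."""
    n_f = (input_fdim - patch) // fstride + 1
    n_t = (input_tdim - patch) // tstride + 1
    return n_f, n_t


def patch_counts(cfg):
    """Return (patches before patchout, patches kept during training).

    Assumes s_patchout_f / s_patchout_t remove whole rows / columns of the grid
    and u_patchout then removes that many of the remaining patches.
    """
    n_f, n_t = patch_grid(
        cfg["input_fdim"], cfg["input_tdim"], cfg["fstride"], cfg["tstride"]
    )
    kept_f = n_f - cfg.get("s_patchout_f", 0)
    kept_t = n_t - cfg.get("s_patchout_t", 0)
    kept = max(kept_f * kept_t - cfg.get("u_patchout", 0), 0)
    return n_f * n_t, kept


# Values copied from the passt-s-f128-p16-s10-ap.476.pt config above.
cfg_s10 = {
    "input_fdim": 128, "input_tdim": 998,
    "fstride": 10, "tstride": 10,
    "s_patchout_f": 4, "s_patchout_t": 40, "u_patchout": 0,
}
print(patch_counts(cfg_s10))  # (1188, 472) under the assumptions above
```

Under these assumptions the stride-10 model sees a 12x99 grid at inference and 8x59 patches per training step; class and distillation tokens are not counted.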

dlthdus0611 commented 1 year ago

Hi, thank you so much for replying.

I have another question. Is there any pre-trained model with the PaSST-U or PaSST-B architecture? I would like to know which models have a configuration where 'u_patchout' is not zero (PaSST-U), or where 'u_patchout' and 's_patchout_t/f' are all zero (PaSST-B), because I want to compare them.

Thank you!

kkoutini commented 1 year ago

Hi, I trained two new models, one with no patchout and one with unstructured patchout of 600, and uploaded them [here](https://github.com/kkoutini/PaSST/releases/tag/v.0.0.7-audioset). I hope this helps.
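
For comparing checkpoints, the naming used in this thread can be sketched as a small helper that labels a config by its patchout settings. The function name `patchout_variant` and the "mixed" label are hypothetical; the rules follow the question above (PaSST-U: only 'u_patchout' is non-zero, PaSST-B: all patchout values are zero) and assume that PaSST-S means structured patchout only.

```python
def patchout_variant(cfg):
    """Label a config dict as PaSST-B, PaSST-U, PaSST-S, or mixed."""
    u = cfg.get("u_patchout", 0)
    s_t = cfg.get("s_patchout_t", 0)
    s_f = cfg.get("s_patchout_f", 0)
    structured = s_t > 0 or s_f > 0

    if u == 0 and not structured:
        return "PaSST-B"  # no patchout at all
    if u > 0 and not structured:
        return "PaSST-U"  # unstructured patchout only
    if u == 0:
        return "PaSST-S"  # structured patchout only
    return "mixed"        # both kinds enabled; not named in this thread


# Patchout values as described in the replies above.
print(patchout_variant({"u_patchout": 0, "s_patchout_t": 40, "s_patchout_f": 4}))
print(patchout_variant({"u_patchout": 600, "s_patchout_t": 0, "s_patchout_f": 0}))
print(patchout_variant({"u_patchout": 0, "s_patchout_t": 0, "s_patchout_f": 0}))
```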