YuanGongND / ssast

Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
BSD 3-Clause "New" or "Revised" License

Bug with input_dim and pure inference #17

Open fanOfJava opened 1 year ago

fanOfJava commented 1 year ago

During finetuning, no matter what input_tdim is set to in run.sh, it always ends up being 1024.

YuanGongND commented 1 year ago

Hi,

Can you elaborate on which argument you are referring to? Is it this one: https://github.com/YuanGongND/ssast/blob/a1a3eecb94731e226308a6812f2fbf268d789caf/src/finetune/esc50/run_esc_patch.sh#L41

Thanks!

-Yuan

fanOfJava commented 1 year ago

yes

YuanGongND commented 1 year ago

Can you explain why the value would be 1024?

It seems to me that it is used in

https://github.com/YuanGongND/ssast/blob/a1a3eecb94731e226308a6812f2fbf268d789caf/src/run.py#L97-L101

and

https://github.com/YuanGongND/ssast/blob/a1a3eecb94731e226308a6812f2fbf268d789caf/src/run.py#L132-L138

for both data loading and model instantiation.
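
For reference, the pattern those two run.py snippets follow is roughly the one below. This is a paraphrased sketch with placeholder values, not the verbatim repository lines; the ASTModel arguments follow the interface documented in the README and the audio_conf key names are the usual dataloader ones, so they may differ slightly from your checkout.

```python
from models import ASTModel  # adjust the import to match your local src/ layout

target_length = 512   # placeholder: whatever you pass via run.sh
num_mel_bins = 128

# data loading: the requested spectrogram length goes into the dataset config
audio_conf = {'num_mel_bins': num_mel_bins, 'target_length': target_length}

# model instantiation: the same value is passed as input_tdim
audio_model = ASTModel(label_dim=50,
                       fshape=16, tshape=16, fstride=10, tstride=10,
                       input_fdim=num_mel_bins, input_tdim=target_length,
                       model_size='base', pretrain_stage=False,
                       load_pretrained_mdl_path='ssast-base-patch-400.pth')
```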

fanOfJava commented 1 year ago

Because the process of loading the model file ssast-base-patch-400.pth changes the target_length. The relevant code is shown below (the paste cuts off mid-block):

```python
try:
    p_fshape, p_tshape = sd['module.v.patch_embed.proj.weight'].shape[2], sd['module.v.patch_embed.proj.weight'].shape[3]
    p_input_fdim, p_input_tdim = sd['module.p_input_fdim'].item(), sd['module.p_input_tdim'].item()
```
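
To check concretely which dims the checkpoint carries, you can inspect it directly. A minimal standalone sketch, using the key names from the excerpt above (the path is a placeholder for your local copy):

```python
import torch

# Load the pretrained checkpoint and print the patch shape and input dims it stores.
sd = torch.load('ssast-base-patch-400.pth', map_location='cpu')
p_fshape = sd['module.v.patch_embed.proj.weight'].shape[2]
p_tshape = sd['module.v.patch_embed.proj.weight'].shape[3]
p_input_fdim = sd['module.p_input_fdim'].item()
p_input_tdim = sd['module.p_input_tdim'].item()
print(f'patch shape stored in checkpoint: ({p_fshape}, {p_tshape})')
print(f'input dims stored in checkpoint:  ({p_input_fdim}, {p_input_tdim})')
```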

fanOfJava commented 1 year ago

I guess this is also why loading the model file for pure inference after finetuning throws an error. I'm not sure whether my understanding is correct.

YuanGongND commented 1 year ago

Can you paste the error message here?

fanOfJava commented 1 year ago

> Can you paste the error message here?

You can print p_input_tdim right before line 156 of ast_model; you will see the error.

YuanGongND commented 1 year ago

I don't have enough time to run it again. The code is a cleaned-up version of the development version. It went through a brief test, and I believe I took care of this. So if you already have an error message, that would be very helpful. It might be due to something else.

fanOfJava commented 1 year ago

I believe many people have the same problem. The model saved after finetuning simply cannot be loaded back for pure inference, so I don't know how to test the real performance of the trained model.

YuanGongND commented 1 year ago

Oh I see, yes, that is a known problem. It should be fine if you finetune a pretrained model that has a different target_length, but if you want to take the finetuned model for deployment, you will get an error.

As for checking the performance: once you finetune a pretrained model, the script will print the accuracy (or mAP) and also save the result to disk.

To deploy the model for inference, you will need to fix the bug.
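
Until the bug is fixed, one possible workaround sketch (not the official fix; the paths and constructor arguments below are placeholders, and it assumes the finetuned checkpoint was saved from a torch.nn.DataParallel-wrapped model, which the 'module.' key prefix above suggests): rebuild the model exactly as the finetuning script did, then overwrite its weights with the finetuned state dict.

```python
import torch
from models import ASTModel  # adjust the import to match your local src/ layout

# Rebuild the architecture with the SAME arguments used during finetuning,
# pointing load_pretrained_mdl_path at the ORIGINAL pretrained checkpoint.
model = ASTModel(label_dim=50,
                 fshape=16, tshape=16, fstride=10, tstride=10,
                 input_fdim=128, input_tdim=512,
                 model_size='base', pretrain_stage=False,
                 load_pretrained_mdl_path='ssast-base-patch-400.pth')

# The finetuned keys start with 'module.', so wrap in DataParallel first,
# then load the finetuned weights on top of the rebuilt model.
model = torch.nn.DataParallel(model)
sd = torch.load('path/to/your_finetuned_model.pth', map_location='cpu')
model.load_state_dict(sd, strict=False)
model.eval()
```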

YuanGongND commented 1 year ago

Can you check this: https://github.com/YuanGongND/ssast/issues/4