Self-Answer:
In models_mae.py and models_vit.py, change the parameters used in 'unpatchify' (in models_mae) and 'random_masking_2d' (in both models_mae and models_vit) to values that suit the SpeechCommands (SPC) input size.
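As a rough illustration of what "suitable parameters" could mean here, the sketch below derives the patch grid for each dataset from its target_length. The 128 mel bins and 16x16 patch size are assumptions made for the example, not values confirmed by this thread, and `patch_grid` is a hypothetical helper that does not exist in the repository.

```python
# Assumed values for illustration only: 128 mel bins and 16x16 patches.
N_MELS = 128
PATCH = 16

# Copied from line 180 of main_pretrain.py as quoted in the question.
target_length = {'audioset': 1024, 'esc50': 512, 'speechcommands': 128}


def patch_grid(num_frames, n_mels=N_MELS, patch=PATCH):
    """Hypothetical helper: return (time_patches, freq_patches) for a spectrogram."""
    if num_frames % patch or n_mels % patch:
        raise ValueError("spectrogram size must be divisible by the patch size")
    return num_frames // patch, n_mels // patch


grids = {name: patch_grid(frames) for name, frames in target_length.items()}
for name, (t, f) in grids.items():
    print(f"{name}: {t} x {f} grid, {t * f} patches")
```

Under these assumptions the grid shrinks from 64 x 8 for audioset to 8 x 8 for speechcommands, so any grid size written for 1024-frame inputs would need to change accordingly.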
Hello, I would like to use the speechcommands dataset for pretraining, but I have encountered an error.
The command is as follows:
python main_pretrain.py \
    --dataset='speechcommand' \
    --data_train='./speechcommand_train_data.json' \
    --data_eval='./datafiles/speechcommand_eval_data.json' \
    --label_csv='./speechcommands_class_labels_indices.csv' \
The following error occurs:
File "../miniconda3/envs/mae/lib/python3.9/site-packages/timm/models/swin_transformer.py", line 330, in _shifted_window_attn
    x = x.view(B, H, W, C)
RuntimeError: shape '[16, 64, 8, 512]' is invalid for input of size 524288
I modified this line of code (line 180 of main_pretrain.py):
target_length = {'audioset':1024, 'esc50':512, 'speechcommands':128}
If I change 'speechcommands':128 to 1024, it runs without errors, but I want to run it with 128.
Could you please help me understand where I went wrong? Thank you!
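One way to read the error message is to compare the number of elements the view asks for with the number the tensor actually holds. The sketch below uses only the numbers printed in the traceback; the interpretation in the comments (that H x W is a token grid) is an assumption based on the variable names in `x.view(B, H, W, C)`.

```python
# Numbers copied from the error message.
requested_shape = (16, 64, 8, 512)  # (B, H, W, C) passed to x.view
actual_numel = 524288               # "input of size 524288"

B, H, W, C = requested_shape
requested_numel = B * H * W * C

# Tokens per sample that the view expects versus what the tensor contains.
tokens_expected = H * W
tokens_actual = actual_numel // (B * C)

print("elements requested:", requested_numel)
print("elements available:", actual_numel)
print("tokens expected per sample:", tokens_expected)
print("tokens actually present per sample:", tokens_actual)
```

The view expects 512 tokens per sample (a 64 x 8 grid) but the tensor holds only 64. That would be consistent with a grid size fixed for 1024-frame inputs while the 128-frame input produces an 8 x 8 grid, assuming 16x16 patches on 128 mel bins.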