Well, although the current code does not support rectangular inputs, I think you only need to change a few things to make it work.
First you can refer to /pretrain/README.md for customizing your audio dataset (and the corresponding data preprocessing code) and your CNN model.
Then you should make fmap_h and fmap_w different (in this line) to represent a rectangular shape, and change everything related to them, such as making input_size in /pretrain/encoder.py and input_size in /pretrain/utils/imagenet.py a tuple.
Then I think things will work.
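To make the fmap_h / fmap_w change concrete, here is a minimal sketch of how a tuple input_size could be turned into a rectangular feature-map shape. It is not the repository's code: the helper names (to_hw, padded_size, fmap_shape) are hypothetical, and the downsample ratio of 32 is only an assumption (it depends on your CNN). Note that 600 is not divisible by 32, so under that assumption the (128, 600) input would need padding or cropping on the time axis first.

```python
import math


def to_hw(input_size):
    """Accept an int (square input) or an (h, w) tuple and return (h, w)."""
    if isinstance(input_size, int):
        return input_size, input_size
    h, w = input_size
    return int(h), int(w)


def padded_size(input_size, downsample_ratio):
    """Round each side up to the nearest multiple of downsample_ratio."""
    h, w = to_hw(input_size)
    return (
        math.ceil(h / downsample_ratio) * downsample_ratio,
        math.ceil(w / downsample_ratio) * downsample_ratio,
    )


def fmap_shape(input_size, downsample_ratio):
    """Return (fmap_h, fmap_w) for an input whose sides divide evenly."""
    h, w = to_hw(input_size)
    if h % downsample_ratio or w % downsample_ratio:
        raise ValueError(
            f"input size {(h, w)} is not divisible by {downsample_ratio}"
        )
    return h // downsample_ratio, w // downsample_ratio


# Assumed downsample ratio; replace with the ratio of your own CNN.
DOWNSAMPLE_RATIO = 32

size = padded_size((128, 600), DOWNSAMPLE_RATIO)   # pad the time axis up to 608
fmap_h, fmap_w = fmap_shape(size, DOWNSAMPLE_RATIO)
print(size, fmap_h, fmap_w)
```

Anything that previously assumed a single fmap size (mask generation, position-related tensors, and so on) would then need to use fmap_h and fmap_w separately.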
Thanks for your thorough and patient responses.
Thank you for your excellent work. I want to do some work on an audio classification task, and I found it is impossible to keep the input shape square. What should I do when the input tensor is a rectangle, like (128, 600)?