hche11 / Localizing-Visual-Sounds-the-Hard-Way

Localizing Visual Sounds the Hard Way
Apache License 2.0

Spectrogram Dimension #13

Closed cyh-0 closed 1 year ago

cyh-0 commented 1 year ago

Hi,

I am wondering how we could get a 257x300 tensor for the spectrograms. I only get 257x200 for 3 s of audio at a 16000 Hz sample rate.
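For reference, the shape arithmetic behind the question can be sketched as below. This is only a minimal illustration, assuming an unpadded STFT (the framing used by, for example, `scipy.signal.spectrogram` with its defaults) and assuming `nperseg=512` and `noverlap=274`. Those two values are not stated in this issue; they are chosen here because they reproduce the 257x200 shape reported above. The helper names `spectrogram_shape` and `min_samples_for_frames` are hypothetical.

```python
def spectrogram_shape(n_samples, nperseg=512, noverlap=274):
    """Return (freq_bins, time_frames) for an unpadded one-sided STFT.

    nperseg and noverlap are assumed values, not confirmed by the issue.
    """
    hop = nperseg - noverlap                   # samples between frame starts
    freq_bins = nperseg // 2 + 1               # one-sided spectrum, nfft == nperseg
    time_frames = 1 + (n_samples - nperseg) // hop
    return freq_bins, time_frames


def min_samples_for_frames(n_frames, nperseg=512, noverlap=274):
    """Smallest number of samples that yields n_frames time frames."""
    hop = nperseg - noverlap
    return nperseg + (n_frames - 1) * hop


# 3 s of audio at 16000 Hz, as described in the question.
print(spectrogram_shape(3 * 16000))

# Number of samples needed for 300 time frames under the same assumptions.
needed = min_samples_for_frames(300)
print(needed, needed / 16000)  # sample count, and its duration in seconds at 16 kHz
```

Under these assumptions the number of frequency bins depends only on `nperseg`, while the number of time frames grows with the number of samples, so a wider tensor requires a longer clip, a higher sample rate, or a smaller hop.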