v-iashin / SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
https://v-iashin.github.io/SpecVQGAN
MIT License
Number of different features #31

Closed by Ivvvvvvvvvvy 1 year ago

Ivvvvvvvvvvy commented 1 year ago

Hello. From reading your paper, I understand how the 212 feats are obtained, but it is not clear to me how the 1-feat and 5-feat settings work. Could you explain how the 1 feat and 5 feats of the ResNet50 model mentioned in your paper were obtained?

v-iashin commented 1 year ago

Here you go:

https://github.com/v-iashin/SpecVQGAN/blob/8ab6981535ab70fad3531688e0f630f1ce3b834f/specvqgan/data/vggsound.py#L55-L76
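For readers who cannot open the link, here is a minimal sketch of one way a sequence of 212 per-frame features could be reduced to 1 or 5. It assumes the reduction is an evenly spaced subsampling along the time axis, with a small shift so that each pick falls inside its segment rather than at the segment's first frame. The function name `subsample_evenly` and the exact index arithmetic are hypothetical and are not copied from the linked lines, so treat the linked code as the authoritative version.

```python
def subsample_evenly(features, num_keep):
    """Pick `num_keep` evenly spaced items from `features`.

    `features` is any sequence of per-frame feature vectors
    (hypothetical helper, for illustration only).
    """
    total = len(features)
    if not 1 <= num_keep <= total:
        raise ValueError("num_keep must be between 1 and len(features)")

    # Start index of each of the `num_keep` equal segments,
    # e.g. total=10, num_keep=2 -> [0, 5].
    starts = [(i * total) // num_keep for i in range(num_keep)]

    # Shift every pick to the right so the first frame of each
    # segment is not always the one selected.
    shift = total // (num_keep + 1)

    return [features[start + shift] for start in starts]


# Stand-in for 212 per-frame features: each "feature" is just its frame index.
frames = list(range(212))
one_feat = subsample_evenly(frames, 1)
five_feats = subsample_evenly(frames, 5)
```

Under these assumptions, the 1-feat setting keeps a single frame from the middle of the clip, and the 5-feat setting keeps five frames spread across it.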

Ivvvvvvvvvvy commented 1 year ago

Thank you very much for your thorough answers every time!