hche11 / VGGSound

VGGSound: A Large-scale Audio-Visual Dataset
http://www.robots.ox.ac.uk/~vgg/data/vggsound/

[Question] Extract ResNet audio embedding layer #4

Closed loretoparisi closed 4 years ago

loretoparisi commented 4 years ago

I would like to extract the "embedding" layer of the VGG network implemented in models. For example, with ResNet-18 for images, I would take the avgpool layer like this:

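from torchvision import models

model = models.resnet18(pretrained=True)
layer = model._modules.get('avgpool')  # global average pooling layer before the classifier
layer_output_size = 512  # avgpool output has 512 channels for ResNet-18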
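
To make the idea concrete, here is a minimal sketch of capturing a layer's output with a forward hook. `TinyNet` and `extract_embedding` are hypothetical names used only for illustration. This is not the VGGSound model, and whether its audio ResNet exposes a layer named `avgpool` is an assumption on my part.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a ResNet-style model (not the VGGSound code)."""

    def __init__(self, channels=512, n_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(channels, n_classes)

    def forward(self, x):
        x = self.conv(x)
        x = self.avgpool(x)
        return self.fc(torch.flatten(x, 1))


def extract_embedding(model, layer_name, inputs):
    """Run one forward pass and return the flattened output of the named layer."""
    captured = {}
    layer = model._modules.get(layer_name)

    def hook(module, inp, out):
        # Store a detached copy of the layer output.
        captured["emb"] = out.detach()

    handle = layer.register_forward_hook(hook)
    model.eval()
    with torch.no_grad():
        model(inputs)
    handle.remove()  # Leave the model without lingering hooks.
    return torch.flatten(captured["emb"], 1)


model = TinyNet()
# Batch of 2 single-channel inputs (e.g. spectrogram-like tensors).
emb = extract_embedding(model, "avgpool", torch.randn(2, 1, 16, 16))
print(emb.shape)  # torch.Size([2, 512])
```

The hook approach leaves the model definition untouched, so the same helper could in principle be pointed at a different layer name if the pooling layer is called something else.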

Is that correct for VGGSound?

Thank you.