rxtan2 / AVSeT

Instructions for using the pretrained model #4

Open ShubhamPandey28 opened 1 year ago

ShubhamPandey28 commented 1 year ago

I am having difficulty loading the weights from music_sound.pth. The checkpoint appears to contain only the audionet layers. It would be very helpful if someone could add loading instructions to the README.
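One way to confirm what the checkpoint actually holds is to list its parameter names and group them by their leading prefix. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the file is either a plain state dict, a dict wrapping one under a "state_dict" key, or a pickled module. None of this is confirmed by the repository.

```python
from collections import Counter


def summarize_prefixes(keys, depth=1):
    """Count parameter names by their first `depth` dotted components."""
    return Counter(".".join(k.split(".")[:depth]) for k in keys)


def checkpoint_keys(path):
    """Return the parameter names stored in a checkpoint file.

    Assumption: the layout of music_sound.pth is not documented, so a few
    common layouts are tried in turn.
    """
    import torch  # imported lazily so summarize_prefixes works without torch

    ckpt = torch.load(path, map_location="cpu")
    if hasattr(ckpt, "state_dict"):  # a whole pickled nn.Module
        ckpt = ckpt.state_dict()
    elif isinstance(ckpt, dict) and "state_dict" in ckpt:  # wrapped state dict
        ckpt = ckpt["state_dict"]
    return list(ckpt.keys())


# Example usage (path taken from the issue):
# print(summarize_prefixes(checkpoint_keys("music_sound.pth")))
```

If every key falls under a single audio-related prefix, that would support the observation that only the audionet layers were saved.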

ShubhamPandey28 commented 11 months ago

@rxtan2

thirteen-bears commented 2 months ago

I think the visual part of the model is the pretrained CLIP ResNet-50, so its weights would not be stored in music_sound.pth.
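If that guess is right, the checkpoint could be loaded non-strictly so that only the matching audio weights are filled in, with the visual encoder initialized separately from CLIP. The following is a minimal sketch under that assumption. The helper names are hypothetical, the "module." prefix handling only matters if the checkpoint was saved from nn.DataParallel, and how the visual encoder is attached in this repository is not shown.

```python
def strip_prefix(state, prefix="module."):
    """Drop a leading prefix (e.g. added by nn.DataParallel) from every key."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state.items()
    }


def load_partial(model, state):
    """Load only the weights whose names match the model.

    Uses load_state_dict(strict=False), which skips mismatched names and
    reports them. Returns (missing_keys, unexpected_keys) so it is clear
    which parts of the model were left at their existing values.
    """
    result = model.load_state_dict(strip_prefix(state), strict=False)
    return list(result.missing_keys), list(result.unexpected_keys)
```

Any names reported as missing would then be the layers that still need weights from elsewhere, for example a separately loaded visual backbone.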