qiuqiangkong / panns_transfer_to_gtzan


Problem with audio tagging when using the inference code #6

Open aliakbartaghizadeh opened 3 years ago

aliakbartaghizadeh commented 3 years ago

Hello, and thanks for sharing the code. The README doesn't say how to run inference on a new file and tag it with the transfer_cnn14 model, so I tried the approach described in the audioset_tagging_cnn-master README, but I get the error shown in the screenshot below. How can I solve this problem? Do you have any dedicated inference code for transfer learning? If so, could you please upload it?

(Screenshot from 2020-12-04 16-45-22)

AntyRia commented 1 year ago

I also encountered the same problem. Have you solved it?

baicaigithub commented 10 months ago

I solved this with the following steps:

  1. Copy inference.py from the audioset_tagging_cnn repo into this panns_transfer_to_gtzan repo.
  2. Use the model type Transfer_Cnn14.
  3. Add freeze_base=True when declaring the model.
  4. Run inference.py.

I found that the reported probabilities are negative. Also, the model in this repo does not have a framewise_output, so it does not work with the sound_event_detection feature. But I think this kaggle notebook will give a hint.

Example for blues.00005.wav:

GPU number: 1
blues: -0.154
rock: -2.781
country: -3.707
hiphop: -4.217
jazz: -4.519
reggae: -4.669
metal: -4.958
pop: -5.078
disco: -5.450
classical: -5.516
embedding: (2048,)
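
One possible reading of the negative values, offered as an assumption rather than something confirmed in this thread: they may be natural-log probabilities (for example, the output of a log-softmax layer), not probabilities. A quick way to check is to exponentiate the scores above and see whether they sum to roughly 1. The sketch below does that with the numbers copied from the example output; the names `log_scores` and `to_probabilities` are made up for illustration and are not part of the repo.

```python
import math

# Scores copied from the example output above for blues.00005.wav.
log_scores = {
    "blues": -0.154,
    "rock": -2.781,
    "country": -3.707,
    "hiphop": -4.217,
    "jazz": -4.519,
    "reggae": -4.669,
    "metal": -4.958,
    "pop": -5.078,
    "disco": -5.450,
    "classical": -5.516,
}


def to_probabilities(scores):
    """Exponentiate each score, treating it as a natural-log probability.

    Assumption: the scores are log-probabilities. If they are raw logits
    instead, a softmax would be needed rather than a plain exp.
    """
    return {label: math.exp(value) for label, value in scores.items()}


probs = to_probabilities(log_scores)
total = sum(probs.values())

# Print labels from most to least likely, then the sum as a sanity check.
for label, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {p:.3f}")
print(f"sum: {total:.3f}")
```

If the printed sum is close to 1, the log-probability reading is plausible, and applying `exp` to the clipwise output in inference.py would give ordinary probabilities.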