tyiannak / deep_audio_features

Pytorch implementation of deep audio embedding calculation
MIT License

Bug in classification report? #49

Open tyiannak opened 2 years ago

tyiannak commented 2 years ago

There's a bug related to path depth in the classification report. To reproduce:

from deep_audio_features.bin import classification_report as cr
cr.test_report('/Users/tyiannak/Downloads/soundscape_8k_1s.pt',
               ['/Users/tyiannak/Downloads/soundscape_8k_1sec/test/1',
                '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/2/',
                '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/3',
                '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/4',
                '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/5'])

Loaded model class mapping: {0: '1', 1: '2', 2: '3', 3: '4', 4: '5'}
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-2-3e59e19efe16> in <module>
----> 1 cr.test_report('/Users/tyiannak/Downloads/soundscape_8k_1s.pt', ['/Users/tyiannak/Downloads/soundscape_8k_1sec/test/1', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/2/', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/3', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/4', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/5'])

/usr/local/lib/python3.9/site-packages/deep_audio_features/bin/classification_report.py in test_report(model_path, folders)
     58 
     59     max_seq_length = model.max_sequence_length
---> 60     files_test, y_test, class_mapping = load_dataset.load(
     61         folders=folders, test=False,
     62         validation=False, class_mapping=class_mapping)

/usr/local/lib/python3.9/site-packages/deep_audio_features/utils/load_dataset.py in load(folders, test_val, test, validation, class_mapping)
     71         folder2idx = {v: k for k, v in idx2folder.items()}
     72 
---> 73     labels = list(map(lambda x: folder2idx[x], labels))
     74 
     75     class_mapping = {}

/usr/local/lib/python3.9/site-packages/deep_audio_features/utils/load_dataset.py in <lambda>(x)
     71         folder2idx = {v: k for k, v in idx2folder.items()}
     72 
---> 73     labels = list(map(lambda x: folder2idx[x], labels))
     74 
     75     class_mapping = {}

KeyError: '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/1'

If I change into the soundscape_8k_1sec directory and then run

cr.test_report('../soundscape_8k_1s.pt', ['test/1', 'test/2/', 'test/3', 'test/4', 'test/5'])

everything runs OK.

The long paths also work fine with the bin.basic_training script. So something is probably going wrong in load_dataset.load(), around the class mapping assignment, when it is called from classification_report.
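A minimal sketch of one possible workaround, assuming the lookup should key on the class folder name (the last path component) rather than on the path as given. The helpers folder_to_class_name and labels_from_folders are hypothetical and not part of deep_audio_features; the traceback does not show how load_dataset.load() actually builds its keys, so this only illustrates the idea:

```python
import os


def folder_to_class_name(folder):
    """Return the last path component, ignoring depth and any trailing slash."""
    return os.path.basename(os.path.normpath(folder))


def labels_from_folders(folders, class_mapping):
    """Map each folder to its class index using a saved {index: name} mapping."""
    # Invert the mapping loaded from the model, e.g. {0: '1'} -> {'1': 0}.
    name2idx = {name: idx for idx, name in class_mapping.items()}
    return [name2idx[folder_to_class_name(f)] for f in folders]


# Mapping as printed in the report above.
class_mapping = {0: '1', 1: '2', 2: '3', 3: '4', 4: '5'}

# Absolute, trailing-slash and relative paths all resolve to the same keys.
folders = [
    '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/1',
    '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/2/',
    'test/3',
]
labels = labels_from_folders(folders, class_mapping)
print(labels)
```

With this normalisation the result no longer depends on how deep the supplied path is, which is the behaviour one would expect from test_report.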

tyiannak commented 2 years ago

@lobracost ^^

HualinR commented 2 years ago

Hi, I think there is a bug in the classification report. I mainly tried:

python3 deep_audio_features/bin/classification_report.py -m ./pkl/CNN1_34_Tue_May__3_16:23:48_2022.pt -i ~/Own_Datasets/W_real_splitdata/Greathall/W/

Here -m is my .pt file (not a full path) and -i is the full path of my testing folder. The script always gets stuck at the point shown here: [image]

When I reset y_pred to all zeros, with the same length as y_true, it runs OK. But when I set y_pred to all ones, with the same length as y_true, the script gets stuck. I guess this is because y_pred contains the int 1, which breaks the line below:

y_pred = [label_mapping[label] for label in y_pred]

I also tried the method mentioned above, without long paths, but I still cannot get results: [image]

Does anyone else have the same problem? How do I solve this?
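A small diagnostic sketch for the guess above, assuming label_mapping is a dict keyed by integer class index. The mapping contents and the helper missing_labels are hypothetical and for illustration only; the real mapping built by classification_report.py may look different:

```python
def missing_labels(y_pred, label_mapping):
    """Return the sorted predicted labels that have no entry in label_mapping."""
    return sorted(set(y_pred) - set(label_mapping))


# Hypothetical mapping with a single known class index.
label_mapping = {0: 'W'}

all_zeros = [0, 0, 0]
all_ones = [1, 1, 1]

print(missing_labels(all_zeros, label_mapping))  # nothing missing
print(missing_labels(all_ones, label_mapping))   # index 1 is not in the mapping

# A lookup such as [label_mapping[label] for label in all_ones] would raise
# KeyError here, because index 1 has no entry.
```

Running a check like this just before the list comprehension would show whether the predicted indices and the mapping keys actually disagree.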

HualinR commented 2 years ago

For the second picture, I know how to deal with it: just change the testing destination folder names to the class names.

pakoromilas commented 2 years ago

Don't use folder names that contain an underscore.