sharathadavanne / seld-dcase2021

Baseline method for sound event localization task of DCASE 2021 challenge

Evaluate data set running #9

Closed zyc-c closed 2 years ago

zyc-c commented 2 years ago

Hi, I want to run the network on the evaluation data set. I could extract the features of the evaluation set successfully, but when I run seld.py the system tells me it cannot find the metadata files. So I am wondering whether metadata files are needed at all when using the evaluation data set. If they are not needed, what should I change to make seld.py run successfully? Or should I point the path to the metadata files of the development data set, although that does not feel right to me? Looking forward to your reply.

sharathadavanne commented 2 years ago

Hi @zyc-c, I don't think I understood your question entirely. From what I understood, if you want to compute results on the evaluation split of the dataset, then you will have to run the code in eval mode. To do this, please change the flag to eval here.
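
For reference, the change amounts to something like the sketch below. Only the relevant entry is shown, and the exact layout of the `params` dict in `parameter.py` may differ in your copy of the repo:

```python
# parameter.py -- sketch of the relevant entry only, other keys omitted
params = dict(
    # ...
    mode='eval',   # was 'dev'; 'eval' runs inference on the evaluation split
    # ...
)
```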

zyc-c commented 2 years ago

Sorry to bother you again. I changed 'dev' to 'eval' and seld.py ran successfully, but some of the test results for the evaluation split are missing, as shown in Fig 1. I suspect that seld.py only prints the metrics in 'dev' mode, so I added print statements for 'eval' mode mirroring the ones for 'dev', as shown in Fig 2, but then seld.py no longer runs successfully. Fig 3 and Fig 4 show the specific errors for score_obj.get_SELD_Results(dcase_output_test_folder) and score_obj.get_consolidated_SELD_Results(dcase_output_test_folder). So I want to know whether the evaluation split is used only for validation results. If not, what should I do to obtain the test results on the evaluation split? The pictures are attached.
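
In case the screenshots do not come through, what I added for 'eval' mode is roughly the following (copied by analogy from the 'dev' branch of seld.py, so the surrounding code and exact names may differ):

```python
# sketch of the extra scoring and printing I added for 'eval' mode,
# mirroring the 'dev' branch of seld.py; these are the two calls that fail
eval_seld_scores = score_obj.get_SELD_Results(dcase_output_test_folder)
consolidated_scores = score_obj.get_consolidated_SELD_Results(dcase_output_test_folder)
print('SELD results on the evaluation split:', eval_seld_scores)
print('Consolidated SELD results:', consolidated_scores)
```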

sharathadavanne commented 2 years ago

The images are missing @zyc-c :) You might have to insert them here. I still don't understand what you are trying to achieve. As explained here, this baseline repo trains on the training folds 1, 2, 3, and 4, uses early stopping on the validation fold 5 to choose the best model, and then evaluates the performance on the unseen test fold 6. This is what the baseline repo does when you set the mode to mode='dev' here. However, when you set the mode to mode='eval', as seen in the code here, we use folds 1, 2, 3, 4, and 5 for training, fold 6 for early stopping, and finally run inference on the unseen evaluation data folds 7 and 8. These folds 7 and 8 are packaged separately in the dataset, and their labels have not been published yet. The labels are only with the organizers and are used to evaluate all the submissions during the challenge. So there is no way you can compute results on the evaluation folds yourself.
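
In other words, the fold assignment I am describing is roughly the following. This is only an illustration of the two modes; the variable names are not taken verbatim from the repo, and the actual assignment lives in the code linked above:

```python
# illustrative sketch of the fold assignment in the two modes
if params['mode'] == 'dev':
    train_splits = [1, 2, 3, 4]      # training folds
    val_splits = [5]                 # early stopping / model selection
    test_splits = [6]                # unseen test fold with published labels
elif params['mode'] == 'eval':
    train_splits = [1, 2, 3, 4, 5]   # full development data for training
    val_splits = [6]                 # early stopping
    test_splits = [7, 8]             # evaluation folds; labels held by the organizers
```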

zyc-c commented 2 years ago

Thank you very much! Actually, what I want to obtain is the test results on the evaluation split, so that I can compare my experimental results with the officially published ones. Before your reply, I did not realize that the labels are not public, and I had also misunderstood the split folds because I was running the system on the development dataset. Thanks again for your reply.

zyc-c commented 2 years ago

Dear professor, sorry to bother you. Actually, I am reproducing the baseline CRNN Multi-ACCDOA network from DCASE 2022 on the FOA dataset. However, I cannot obtain results similar to those reported on your GitHub page; there is an enormous gap between the two. I later found a reply to a similar question, but I still do not understand the workflow in detail. Should I mix the synthetic split and the real split directly and then run feature extraction? Or should I first train the network on the synthetic split, save the best model as a pretrained model in 'parameter.py', and then test on the real split using that pretrained model? Could you describe the detailed workflow in the code when you have time, if that is allowed? I would also like to know whether an external dataset is necessary if I submit results to DCASE 2022.

sharathadavanne commented 2 years ago

Hi @zyc-c, can you raise this issue on the DCASE2022 repo, so that it is visible to other participants?