Open chester-w-xie opened 2 years ago
Thanks for pointing this out!
I confirmed that there are duplicates in FSD_MIX_CLIPS.annotations. This results from the process of chunking 1-sec windows around all sound events in each clip. When multiple sound events are closely located in time, we could chunk the same window multiple times.
Specifically, the number of duplicates in each split is:
Base-train: 1145
Base-val: 146
Base-test: 139
Novel-val: 35
Novel-test: 20
This gives a total of 1485 duplicates, as observed.
Overwriting shouldn't be a problem since these are just duplicates of the same time window. We should instead update the annotation files and dataset description.
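A minimal sketch of how duplicate windows could be counted and dropped when updating the annotation files. It assumes each annotation identifies a window by a file name and a start sample. The field names `filename` and `start_sample` and the example rows are hypothetical, not the actual layout of FSD_MIX_CLIPS.annotations.

```python
from collections import Counter

# Hypothetical annotation rows; the real files may use other field names.
rows = [
    {"filename": "clip_0001.wav", "start_sample": 0},
    {"filename": "clip_0001.wav", "start_sample": 0},      # same window chunked twice
    {"filename": "clip_0001.wav", "start_sample": 44100},
    {"filename": "clip_0002.wav", "start_sample": 0},
]


def window_key(row):
    """Key that identifies one 1-sec window (assumed: file name + start sample)."""
    return (row["filename"], row["start_sample"])


def count_duplicates(rows):
    """Number of rows that repeat an earlier window."""
    counts = Counter(window_key(r) for r in rows)
    return sum(n - 1 for n in counts.values())


def drop_duplicates(rows):
    """Keep the first occurrence of each window, preserving order."""
    seen = set()
    unique = []
    for r in rows:
        key = window_key(r)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


print(count_duplicates(rows))      # 1
print(len(drop_duplicates(rows)))  # 3
```

Running the same count per split should reproduce the per-split duplicate numbers if the assumed key matches how windows are actually identified.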
Hi, can you please upload the updated annotation files? Thank you very much!
I have written a script to solve this problem; see https://github.com/wangyu/rethink-audio-fsl/pull/19
I ran the following command to get OpenL3 embeddings of the clips in FSD-MIX-CLIPS:
python get_openl3emb_and_filelist.py \
    --annpath PATH-TO-FSD_MIX_CLIPS.annotations \
    --audiopath PATH-TO-FSD_MIX_SED.audio \
    --savepath PATH_TO_SAVE_OUTPUT
I counted the number of .pkl files generated by the program, and the results are as follows:
Base-train: 448,123
Base-val: 65,520
Base-test: 65,422
Novel-val: 17,347
Novel-test: 16,636
The total is 613,048, not the 614,533 described in the paper.
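The totals can be checked with a few lines of arithmetic. The counts are the ones reported in this thread; the variable names are only for illustration.

```python
# .pkl files produced per split, as reported above
pkl_counts = {
    "Base-train": 448_123,
    "Base-val": 65_520,
    "Base-test": 65_422,
    "Novel-val": 17_347,
    "Novel-test": 16_636,
}

expected_total = 614_533  # number of clips expected

actual_total = sum(pkl_counts.values())
missing = expected_total - actual_total

print(actual_total)  # 613048
print(missing)       # 1485
```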
My guess is that some annotations in FSD_MIX_CLIPS.annotations overlap, so the corresponding output files were overwritten during program execution.
Therefore, I made a small change to line 25 of file get_openl3emb_and_filelist.py:
outfile = join(savefolder, fname.replace('.wav', '_' + str(startsample) + '' + str(idx) + '.pkl'))
The number of files obtained after rerunning the code is then consistent with what is described in the paper.
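The sketch below illustrates why adding the index avoids overwriting. It assumes the original output name depends only on the file name and start sample, which is not shown in this thread. The helper names, the example annotations, and the `'_'` separator before the index are hypothetical choices for illustration.

```python
from os.path import join


def outfile_original(savefolder, fname, startsample):
    # Assumed original naming: file name + start sample only
    return join(savefolder, fname.replace('.wav', '_' + str(startsample) + '.pkl'))


def outfile_with_idx(savefolder, fname, startsample, idx):
    # Naming with the annotation index appended, so every row gets its own file
    return join(savefolder, fname.replace('.wav', '_' + str(startsample) + '_' + str(idx) + '.pkl'))


# Hypothetical annotations: (file name, start sample); the first two are duplicates
annotations = [
    ("clip_0001.wav", 0),
    ("clip_0001.wav", 0),
    ("clip_0001.wav", 44100),
]

original_paths = {outfile_original("out", f, s) for f, s in annotations}
indexed_paths = {outfile_with_idx("out", f, s, i) for i, (f, s) in enumerate(annotations)}

print(len(original_paths))  # 2: the duplicate row overwrites the first file
print(len(indexed_paths))   # 3: one file per annotation row
```

Note that this keeps one file per annotation row, duplicates included, so the file count matches the annotation count without removing the duplicated windows themselves.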
Could you check whether the annotations in FSD_MIX_CLIPS.annotations do overlap? Thank you.