wangyu / rethink-audio-fsl

Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021)
MIT License

There may be some duplication of annotation information #12

Open chester-w-xie opened 2 years ago

chester-w-xie commented 2 years ago

I ran the following command to get OpenL3 embeddings of the clips in FSD-MIX-CLIPS:

    python get_openl3emb_and_filelist.py \
        --annpath PATH-TO-FSD_MIX_CLIPS.annotations \
        --audiopath PATH-TO-FSD_MIX_SED.audio \
        --savepath PATH_TO_SAVE_OUTPUT

I counted the .pkl files generated by the program. The results are as follows:

Base-train: 448,123
Base-val: 65,520
Base-test: 65,422
Novel-val: 17,347
Novel-test: 16,636

The total is 613,048, not 614,533.

My guess is that some annotations in FSD_MIX_CLIPS.annotations overlap, so the corresponding output files were overwritten during program execution.

Therefore, I made a small change to line 25 of file get_openl3emb_and_filelist.py:

outfile = join(savefolder, fname.replace('.wav', '_' + str(startsample) + '' + str(idx) + '.pkl'))

The number of files obtained after rerunning the code is then consistent with what is described in the paper.

Could you check whether the annotation information in FSD_MIX_CLIPS.annotations does in fact overlap? Thank you.
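The overwrite hypothesis can be checked without rerunning the embedding extraction, by counting how many annotation rows map to the same output filename. The following is a minimal sketch, not the repository's code: the `rows` data, the `(filename, start sample)` layout, and the `output_name` helper are illustrative assumptions, since the actual column layout of FSD_MIX_CLIPS.annotations is not shown in this thread.

```python
from collections import Counter

# Hypothetical annotation rows as (filename, start sample) pairs.
# The real annotation format may differ; this is illustrative data only.
rows = [
    ("clip_a.wav", 0),
    ("clip_a.wav", 44100),
    ("clip_a.wav", 44100),  # same window listed twice
    ("clip_b.wav", 0),
]


def output_name(fname, startsample):
    # Assumed naming scheme: one .pkl per (file, start sample) pair.
    return fname.replace(".wav", "_" + str(startsample) + ".pkl")


# Count how many rows produce each output filename.
counts = Counter(output_name(f, s) for f, s in rows)

# Any name produced more than once is written, then overwritten.
collisions = {name: n for name, n in counts.items() if n > 1}
overwritten = sum(n - 1 for n in collisions.values())

print(len(rows), "rows ->", len(counts), "files;", overwritten, "overwritten")
```

If the number of overwritten files computed this way on the real annotations equals the gap in the file counts, the missing files are explained by name collisions alone.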

wangyu commented 2 years ago

Thanks for pointing this out!

I confirmed that there are duplicates in FSD_MIX_CLIPS.annotations. This results from the process of chunking 1-sec windows around all sound events in each clip. When multiple sound events are closely located in time, we could chunk the same window multiple times.
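One way such duplicates can arise is sketched below. This is only an illustration under stated assumptions, not the dataset's actual chunking code: it assumes windows are aligned to a fixed 1-second grid, and the sample rate `SR`, the `windows_for_event` helper, and the event times are all hypothetical.

```python
SR = 44100  # assumed sample rate, for illustration only


def windows_for_event(onset_s, offset_s, win_s=1.0):
    """Return start samples of grid-aligned windows that overlap an event."""
    first = int(onset_s // win_s)
    last = int(offset_s // win_s)
    return [int(k * win_s * SR) for k in range(first, last + 1)]


# Two hypothetical events in the same clip, close together in time.
events = [(0.2, 0.6), (0.7, 1.3)]

all_windows = []
for onset, offset in events:
    all_windows.extend(windows_for_event(onset, offset))

# Both events fall into the window starting at sample 0,
# so that window is emitted once per event.
n_duplicates = len(all_windows) - len(set(all_windows))
print(all_windows, "duplicates:", n_duplicates)
```

Under this assumption, every pair of events sharing a window contributes one extra annotation row for the same time span.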

Specifically, the number of duplicates in each split is:

Base-train: 1145
Base-val: 146
Base-test: 139
Novel-val: 35
Novel-test: 20

This gives a total of 1485 duplicates, as observed.
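The figures in this thread can be cross-checked with a few lines of arithmetic. The numbers below are copied from the comments above; only the variable names are made up for this sketch.

```python
# Number of .pkl files written per split, as reported in the issue.
files_written = {
    "Base-train": 448_123,
    "Base-val": 65_520,
    "Base-test": 65_422,
    "Novel-val": 17_347,
    "Novel-test": 16_636,
}

# Number of duplicate annotations per split, as reported in the reply.
duplicates = {
    "Base-train": 1145,
    "Base-val": 146,
    "Base-test": 139,
    "Novel-val": 35,
    "Novel-test": 20,
}

expected_total = 614_533  # total annotation count cited in the issue

total_files = sum(files_written.values())
total_dups = sum(duplicates.values())

print(total_files, total_dups, total_files + total_dups == expected_total)
```

The unique files plus the duplicates add up to the expected total, so the duplicates fully account for the gap.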

Overwriting shouldn't be a problem since these are just duplicates of the same time window. We should instead update the annotation files and dataset description.
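Updating the annotation files amounts to keeping the first occurrence of each window and dropping the repeats. The following is a minimal sketch of that idea, assuming each annotation row can be identified by a filename and a start sample. The field names `filename`, `start_sample`, and `label`, the example rows, and the `drop_duplicate_windows` helper are hypothetical and are not taken from the repository or from any pull request.

```python
def drop_duplicate_windows(rows):
    """Keep the first row for each (filename, start_sample) pair, in order."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["filename"], row["start_sample"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical annotation rows; the real files may use other fields.
rows = [
    {"filename": "clip_a.wav", "start_sample": 0, "label": "dog"},
    {"filename": "clip_a.wav", "start_sample": 44100, "label": "dog"},
    {"filename": "clip_a.wav", "start_sample": 44100, "label": "dog"},
    {"filename": "clip_b.wav", "start_sample": 0, "label": "cat"},
]

deduped = drop_duplicate_windows(rows)
print(len(rows), "->", len(deduped))
```

Keeping the original row order means the cleaned annotations stay aligned with any file lists generated from them.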

chester-w-xie commented 2 years ago

Hi, can you please upload the updated annotation files? Thank you very much!

chester-w-xie commented 2 years ago

I have written a script to solve this problem, see https://github.com/wangyu/rethink-audio-fsl/pull/19