Closed suncerock closed 10 months ago
Hi,
If I remember correctly, the results in the table use the cleaned version (21k clips in total). Because of the data split discrepancy described in the paper, we reproduced all the results using this new split.
Hi, thank you for your great work and for building benchmark results for all the representative auto-tagging models. I have a further question on the data splitting of the MTAT dataset.
In the SMC paper, you mentioned that you did not discard the tracks with no associated labels (which might lead to performance decay). However, both the split npy files in this repo and the split files in the repo you referred to in the SMC paper discard those tracks. Could I kindly ask whether the results are based on the cleaned version of the dataset, which discards those tracks?
For your reference, the original version should have 18706 tracks for training, 1825 for validation, and 5329 for testing (25860 in total). The clean version should have 15247 for training, 1529 for validation, and 4332 for testing (21108 in total).
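For anyone checking their own split files against these numbers, here is a minimal sketch of the "cleaning" step and of the split arithmetic. It assumes each track has a binary tag vector. The function name `drop_unlabeled` and the toy `labels` dictionary are hypothetical and are not taken from the repo; only the split sizes come from the counts quoted above.

```python
def drop_unlabeled(track_ids, labels):
    """Keep only tracks whose binary tag vector has at least one positive tag."""
    return [t for t in track_ids if any(labels[t])]


# Toy example (hypothetical data): tracks "b" and "d" carry no tags.
labels = {
    "a": [0, 1, 0],
    "b": [0, 0, 0],
    "c": [1, 0, 1],
    "d": [0, 0, 0],
}
clean = drop_unlabeled(["a", "b", "c", "d"], labels)

# Split sizes quoted in this issue.
original = {"train": 18706, "valid": 1825, "test": 5329}
cleaned = {"train": 15247, "valid": 1529, "test": 4332}

original_total = sum(original.values())
cleaned_total = sum(cleaned.values())

# Number of tracks removed from each split by the cleaning step.
removed = {name: original[name] - cleaned[name] for name in original}
```

With the quoted sizes, the difference between the two versions is 4752 tracks in total, which is the number of tracks that would have to be unlabeled if the cleaning step is the only difference between the splits.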