Closed chamathsilva closed 5 months ago
Thank you for your comment. While the mentioned concern does not apply to the Audio Annotation task, 10-fold cross-validation is necessary when comparing a proposed audio classifier with similar prior works. However, as is clear from our paper and our report on paperswithcode.com, we never intended to compare our results with works that use 10-fold cross-validation. The reason is that our work focuses on demonstrating a context-aware approach to the Audio Annotation and Classification tasks.
Even though this work does not aim to compare its results with works that use 10-fold cross-validation, the dataset documentation explicitly makes the following point, which should be taken into consideration:
"If you reshuffle the data (e.g., combine the data from all folds and generate a random train/test split), you will be incorrectly placing related samples in both the train and test sets. This can lead to inflated scores that do not accurately represent your model's performance on unseen data. In other words, your results may be misleading or incorrect."
It is crucial to preserve the integrity of the data during the train/test split. Failing to do so can place related samples (slices cut from the same original recording) in both the training and testing sets, which artificially inflates the model's performance scores.
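The leak-free evaluation described above can be sketched as a leave-one-fold-out split over the dataset's metadata. This is a minimal illustration, not the repository's actual code: the miniature `meta` table below is hypothetical, standing in for UrbanSound8K's `UrbanSound8K.csv`, which assigns every audio slice to one of ten predefined folds so that slices from the same recording never straddle a fold boundary.

```python
import pandas as pd

# Hypothetical miniature metadata shaped like UrbanSound8K.csv:
# each row is one audio slice; slices from the same source recording
# always share a fold, so splitting by fold cannot leak related samples.
meta = pd.DataFrame({
    "slice_file_name": [f"clip{i}.wav" for i in range(12)],
    "fold": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "classID": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

def fold_split(meta, test_fold):
    """Leave-one-fold-out split: the held-out fold becomes the test set,
    all remaining folds become the training set. Because folds are never
    reshuffled, related slices cannot appear on both sides."""
    test = meta[meta["fold"] == test_fold]
    train = meta[meta["fold"] != test_fold]
    return train, test

train, test = fold_split(meta, test_fold=3)
# No file appears in both splits.
assert set(train["slice_file_name"]).isdisjoint(set(test["slice_file_name"]))
```

Iterating `test_fold` over all ten folds and averaging the per-fold scores gives the standard 10-fold protocol the dataset authors ask for; reshuffling `meta` first is exactly what the quoted warning forbids.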
I have a similar comment here: how can you ensure there is no leakage between the validation and training sets, considering that features are extracted and learned from the entire dataset?
Thank you for your comment. As explained in the paper and evident in the implemented code, the test set, validation set, and training set are entirely distinct, with no overlap among them. Whether the default dataset folds should be preserved depends on the objective: if the aim is general classification, then yes; if the goal is context-aware classification, then no. The default folds introduce a challenging corner case, which is useful when addressing the general classification problem. When employing a context-aware method, however, it is essential to examine the normal case rather than the corner case, as suggested by the Central Limit Theorem. In fact, my paper aims to demonstrate how solving machine-learning problems through a context-aware approach can enhance accuracy.
Under the URBANSOUND8K DATASET documentation, the authors specifically mention the point quoted above.
More details: https://urbansounddataset.weebly.com/urbansound8k.html