Open Yuening-Ma opened 5 months ago
Good point. For version 1.0.0 of the dataset (which we did not publish with audb), we used the original data and labels. In version 2.0.0 we added re-annotations for cough and sneeze and removed files that were marked as bad audio or as containing other sound classes; compare https://github.com/audeering/cough-speech-sneeze/blob/main/2.0.0/publish.py
You don't have to add a citation for audb, but if you want, you could use https://arxiv.org/abs/2303.00645
BTW, the raw labels of our re-annotation are available at https://github.com/audeering/cough-speech-sneeze/blob/main/2.0.0/annotations/20210412-102437-cough-sneeze/20210412-102437_cough-and-sneeze_annotations-cough_sneeze.csv.
You can also load the dataset with the original labels:
>>> import audb
>>> audb.versions("cough-speech-sneeze")
['1.0.0', '2.0.0', '2.0.1']
So, if you do:
>>> db = audb.load("cough-speech-sneeze", version="1.0.0")
You should get the original data and labels.
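If you want to check what changed between the original labels and the re-annotated ones, a rough sketch like the following might help. It is not part of the dataset tooling. The `table_id` and `column` arguments are placeholders (inspect `db.tables` to find the real names), and it assumes a filewise table with one label per file. `load_labels` and `compare_labels` are hypothetical helper names.

```python
from collections import Counter


def load_labels(version, table_id, column):
    """Return {file: label} for one version of the dataset.

    table_id and column are placeholders, not confirmed names:
    inspect db.tables after loading to find the real ones.
    Assumes one label per file (filewise table).
    """
    import audb  # imported lazily so compare_labels works without audb

    db = audb.load(
        "cough-speech-sneeze",
        version=version,
        only_metadata=True,  # fetch the tables only, skip the audio files
        full_path=False,     # relative paths, so versions are comparable
        verbose=False,
    )
    series = db[table_id][column].get()  # pandas Series indexed by file
    return dict(zip(series.index.get_level_values(0), series))


def compare_labels(old, new):
    """Return files missing from `new` and counts of label changes."""
    removed = sorted(set(old) - set(new))
    changed = Counter(
        (old[f], new[f])
        for f in old.keys() & new.keys()
        if old[f] != new[f]
    )
    return removed, changed


# Example usage (needs audb and network access; names are placeholders):
# old = load_labels("1.0.0", "<table_id>", "<column>")
# new = load_labels("2.0.0", "<table_id>", "<column>")
# removed, changed = compare_labels(old, new)
```

`removed` then lists the files that were dropped in the newer version, and `changed` counts each (old label, new label) pair among the files that are present in both.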
Many thanks for your very timely reply! You really did a solid job; the 2.0 version of the dataset I downloaded is quite clean! I will read the code and the annotation file for more detail.
Hello, I'm training a model for cough detection and I would like to use the cough-speech-sneeze dataset from the audeering datasets.
I found the following dataset description on this page: Dataset based on the publication of Shahin Amiriparian: “Amiriparian, S., Pugachevskiy, S., Cummins, N., Hantke, S., Pohjalainen, J., Keren, G., Schuller, B., 2017. CAST a database: Rapid targeted large-scale big data acquisition via small-world modelling of social media platforms, in: 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE, pp. 340–345. https://doi.org/10.1109/ACII.2017.8273622”
I have downloaded the dataset using the audb code (many thanks for the data!). However, I would like to know: is this the original dataset created by the authors of the paper above (Amiriparian et al.), or have the audeering authors organized and modified the data? I need this information so that I can describe the data source correctly in my paper. Thanks again!
PS: How should I cite your work if I download the data with audb?