Open abfleishman opened 4 months ago
@abfleishman I believe the dataset used is https://ieee-dataport.org/documents/dominica-dataset#files
@khughitt interesting! It looks like I have to pay to access it? Does that sound correct?
@abfleishman It appears so. The data itself is open and has a CC license, which is good. I reached out to the listed author to ask whether it is available elsewhere and, if not, whether they would consider uploading it to Zenodo or another open data repository.
That dataset contains only echolocation click recordings, no codas. That's not the dataset used in this paper.
@anmoisio The question was about the source of the original recordings, which I believe is what is hosted on the IEEE link I shared, although I could be wrong. You are correct that the paper does not start from the raw data, but instead builds on a couple of different processed coda tabular data files which are hosted in the repo at https://github.com/pratyushasharma/sw-combinatoriality/tree/main/data.
If I've understood correctly, codas are the sequences of clicks these whales use for communicating, while echolocation clicks are different. The other details don't match either: the paper says the data is from the years between 2005 and 2018, whereas the linked dataset was collected in September 2023.
@anmoisio Ah, good catch! You are correct! The dates don't line up and the dataset does appear to be focused on echolocation clicks and not codas. The author of the IEEE dataset discusses both of these here: https://arxiv.org/pdf/2401.00900.
In that case, if the original audio recordings are publicly available, I'm not sure where. I couldn't find anything else online or via the Dominica Sperm Whale Project website.
@khughitt I couldn't find the correct dataset either, unfortunately.
Are the acoustic recordings associated with this paper available somewhere? Reading the data availability statement in the paper I expected to find them in this GitHub repository.