This dataset contains audio recordings of 71 bird species gathered from three sources (The Macaulay Library, the Art & Science Centre (UCLA), and the Great Himalayan National Park dataset). The full dataset is only ~450 MB and is split by folder into testing, training, and validation sets. The original recordings vary in length from 0.5 seconds to 320 seconds and are all sampled at 44.1 kHz. However, the distributed dataset includes only the Mel-spectrograms of the bird calls, stored in .npy format, not the raw audio.
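For anyone who wants to inspect the files, here is a minimal sketch of reading one split of the spectrograms with NumPy. The folder name `training` and the helper `load_split` are assumptions for illustration, not names confirmed by the dataset; the recursive search is used because the exact layout inside each split folder is not known.

```python
from pathlib import Path

import numpy as np


def load_split(root, split):
    """Load every .npy file found under root/split.

    Returns a dict mapping each file's path (relative to the split
    folder) to the array stored in it. The split folder name is an
    assumption; adjust it to match the actual dataset layout.
    """
    split_dir = Path(root) / split
    spectrograms = {}
    # rglob searches recursively, so this works whether the files sit
    # directly in the split folder or in per-species subfolders.
    for npy_path in sorted(split_dir.rglob("*.npy")):
        key = npy_path.relative_to(split_dir).as_posix()
        spectrograms[key] = np.load(npy_path)
    return spectrograms
```

Calling `load_split("path/to/dataset", "training")` would return the arrays keyed by relative path; their shapes depend on how the Mel-spectrograms were computed, which the dataset description does not state.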
Closing: the dataset is interesting, but we don't want our library to include a dataset that starts from spectrograms. We would like to start from audio.
The Birdcalls71 dataset was found in the paper by Anshul Thakur et al., "Multiscale CNN based Deep Metric Learning for Bioacoustic Classification: Overcoming Training Data Scarcity Using Dynamic Triplet Loss" (still under review). I suspect the researchers are the people who curated this dataset, though I'm not completely sure.