There are many great software tools for researchers studying acoustic communication in animals[^1]. But our research groups work with a wide range of data formats: for audio, for array data, for annotations. This means we write a lot of low-level code to deal with those formats, and our analysis code ends up tightly coupled to them. In turn, this makes it hard for other groups to read our code, and it takes a real investment to understand our analyses, workflows, and pipelines. It also means significant work is required to translate a pipeline or analysis that a scientist-coder worked out in a Jupyter notebook into a generalized, robust service provided by an application.
In particular, acoustic communication researchers working with the Python programming language face these problems. How can our scripts and libraries talk to each other? Luckily, Python is a great glue language! Let's use it to solve these problems.
The goals of VocalPy are to:

- make it easier to work with the wide range of data formats used in acoustic communication research: audio, array data, and annotations
- provide data types that decouple analysis code from those formats, so that code is easier to read and share across research groups
- make it easier to turn a pipeline or analysis worked out in a notebook into a generalized, robust application
A paper introducing VocalPy and its design has been accepted at Forum Acusticum 2023 as part of the session "Open-source software and cutting-edge applications in bio-acoustics", and will be published in the proceedings.
[^1]: For a curated collection, see https://github.com/rhine3/bioacoustics-software.
#### The `vocalpy.Sound` data type

>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/source/audio_wav_annot_birdsongrec/Bird0/Wave/')
>>> wav_paths = voc.paths.from_dir(data_dir, 'wav')
>>> audios = [voc.Sound.read(wav_path) for wav_path in wav_paths]
>>> print(audios[0])
vocalpy.Sound(data=array([3.0517...66210938e-04]), samplerate=32000, channels=1,
path=PosixPath('tests/data-for-tests/source/audio_wav_annot_birdsongrec/Bird0/Wave/0.wav'))
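Because a `Sound` carries its samples and sampling rate together, derived quantities need no extra bookkeeping. A minimal sketch, assuming (as the repr above suggests) that `data` is an array with samples along the last axis:

```python
>>> sound = audios[0]
>>> # duration in seconds: number of samples divided by samples per second
>>> sound.data.shape[-1] / sound.samplerate
```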
#### The `vocalpy.Spectrogram` data type

>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/generated/spect_npz/')
>>> spect_paths = voc.paths.from_dir(data_dir, 'wav.npz')
>>> spects = [voc.Spectrogram.read(spect_path) for spect_path in spect_paths]
>>> print(spects[0])
vocalpy.Spectrogram(data=array([[3.463...7970774e-14]]), frequencies=array([ 0....7.5, 16000. ]), times=array([0.008,...7.648, 7.65 ]),
path=PosixPath('tests/data-for-tests/generated/spect_npz/0.wav.npz'), audio_path=None)
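Because a `Spectrogram` bundles the matrix together with its `frequencies` and `times` vectors, you can plot it without re-deriving axes from FFT parameters. A minimal sketch using matplotlib (an assumption of this example, not shown above), assuming `data` has shape (frequencies, times):

```python
>>> import matplotlib.pyplot as plt
>>> spect = spects[0]
>>> # times along x, frequencies along y, amplitude as color
>>> plt.pcolormesh(spect.times, spect.frequencies, spect.data)
>>> plt.show()
```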
#### The `vocalpy.Annotation` data type

>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/')
>>> notmat_paths = voc.paths.from_dir(data_dir, '.not.mat')
>>> annots = [voc.Annotation.read(notmat_path, format='notmat') for notmat_path in notmat_paths]
>>> print(annots[1])
Annotation(data=Annotation(annot_path=PosixPath('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/gy6or6_baseline_230312_0809.141.cbin.not.mat'),
notated_path=PosixPath('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/gy6or6_baseline_230312_0809.141.cbin'),
seq=<Sequence with 57 segments>), path=PosixPath('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/gy6or6_baseline_230312_0809.141.cbin.not.mat'))
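The `data` attribute holds the annotation as loaded by crowsetta, so segment-level information is close at hand. A sketch, assuming the crowsetta `Sequence` exposes `onsets_s`, `offsets_s`, and `labels` arrays (check the crowsetta docs for your version):

```python
>>> annot = annots[1]
>>> seq = annot.data.seq
>>> # (onset, offset, label) for the first three segments
>>> list(zip(seq.onsets_s, seq.offsets_s, seq.labels))[:3]
```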
#### The `vocalpy.Segmenter` class, for segmentation into sequences of units

>>> import evfuncs
>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/')
>>> cbin_paths = voc.paths.from_dir(data_dir, 'cbin')
>>> audios = [voc.Sound.read(cbin_path) for cbin_path in cbin_paths]
>>> segment_params = {'threshold': 1500, 'min_syl_dur': 0.01, 'min_silent_dur': 0.006}
>>> segmenter = voc.Segmenter(callback=evfuncs.segment_song, segment_params=segment_params)
>>> seqs = segmenter.segment(audios, parallelize=True)
[ ########################################] | 100% Completed | 122.91 ms
>>> print(seqs[1])
Sequence(units=[Unit(onset=2.19075, offset=2.20428125, label='-', audio=None, spectrogram=None),
Unit(onset=2.35478125, offset=2.38815625, label='-', audio=None, spectrogram=None),
Unit(onset=2.8410625, offset=2.86715625, label='-', audio=None, spectrogram=None),
Unit(onset=3.48234375, offset=3.49371875, label='-', audio=None, spectrogram=None),
Unit(onset=3.57021875, offset=3.60296875, label='-', audio=None, spectrogram=None),
Unit(onset=3.64403125, offset=3.67721875, label='-', audio=None, spectrogram=None),
Unit(onset=3.72228125, offset=3.74478125, label='-', audio=None, spectrogram=None),
Unit(onset=3.8036875, offset=3.8158125, label='-', audio=None, spectrogram=None),
Unit(onset=3.82328125, offset=3.83646875, label='-', audio=None, spectrogram=None),
Unit(onset=4.13759375, offset=4.16346875, label='-', audio=None, spectrogram=None),
Unit(onset=4.80278125, offset=4.814, label='-', audio=None, spectrogram=None),
Unit(onset=4.908125, offset=4.922875, label='-', audio=None, spectrogram=None),
Unit(onset=4.9643125, offset=4.992625, label='-', audio=None, spectrogram=None),
Unit(onset=5.039625, offset=5.0506875, label='-', audio=None, spectrogram=None),
Unit(onset=5.10165625, offset=5.1385, label='-', audio=None, spectrogram=None),
Unit(onset=5.146875, offset=5.16203125, label='-', audio=None, spectrogram=None),
Unit(onset=5.46390625, offset=5.49409375, label='-', audio=None, spectrogram=None),
Unit(onset=6.14503125, offset=6.1565625, label='-', audio=None, spectrogram=None),
Unit(onset=6.31003125, offset=6.346125, label='-', audio=None, spectrogram=None),
Unit(onset=6.38996875, offset=6.4018125, label='-', audio=None, spectrogram=None),
Unit(onset=6.46053125, offset=6.4796875, label='-', audio=None, spectrogram=None),
Unit(onset=6.83525, offset=6.8643125, label='-', audio=None, spectrogram=None)], method='segment_song',
segment_params={'threshold': 1500, 'min_syl_dur': 0.01, 'min_silent_dur': 0.006},
audio=vocalpy.Sound(data=None, samplerate=None, channels=None,
path=PosixPath('tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/gy6or6_baseline_230312_0809.141.cbin')), spectrogram=None)
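Each `Unit` in a `Sequence` records its `onset` and `offset` in seconds, as the output above shows, so simple acoustic statistics are a comprehension away:

```python
>>> # duration of each unit in the second sequence, in seconds
>>> durations = [unit.offset - unit.onset for unit in seqs[1].units]
```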
#### The `vocalpy.SpectrogramMaker` class, for computing spectrograms

>>> import vocalpy as voc
>>> data_dir = ('tests/data-for-tests/source/audio_wav_annot_birdsongrec/Bird0/Wave/')
>>> wav_paths = voc.paths.from_dir(data_dir, 'wav')
>>> audios = [voc.Sound.read(wav_path) for wav_path in wav_paths]
>>> spect_params = {'fft_size': 512, 'step_size': 64}
>>> spect_maker = voc.SpectrogramMaker(spect_params=spect_params)
>>> spects = spect_maker.make(audios, parallelize=True)
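As with `Segmenter`, `make` returns one output per input `Sound`, so you can sanity-check results before moving on. A minimal sketch using the `Spectrogram` attributes shown earlier (`data`, `times`):

```python
>>> for spect in spects:
...     # array shape and the time of the last spectrogram frame
...     print(spect.data.shape, spect.times[-1])
```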
#### Datasets you flexibly build from pipelines and convert to databases

The `vocalpy.dataset` module contains classes that represent common types of datasets, such as a list of `vocalpy.Sequence`s or a list of `vocalpy.Spectrogram`s. When you build them with vocalpy, these datasets capture key metadata from your pipeline: for example, a dataset of sequences records the parameters used to segment the audio and the paths to the source files.

vocalpy comes with built-in support for persisting datasets to SQLite, a lightweight, efficient single-file database format. SQLite is the only database file format recommended by the US Library of Congress for archival data, and it's built into Python -- no need to install separate database software like MySQL.

##### The `vocalpy.dataset.SequenceDataset` class, for common analyses of sequences of units

>>> import evfuncs
>>> import vocalpy as voc
>>> data_dir = 'tests/data-for-tests/source/audio_cbin_annot_notmat/gy6or6/032312/'
>>> cbin_paths = voc.paths.from_dir(data_dir, 'cbin')
>>> audios = [voc.Sound.read(cbin_path) for cbin_path in cbin_paths]
>>> segment_params = {
...     'threshold': 1500,
...     'min_syl_dur': 0.01,
...     'min_silent_dur': 0.006,
... }
>>> segmenter = voc.Segmenter(
...     callback=evfuncs.segment_song,
...     segment_params=segment_params
... )
>>> seqs = segmenter.segment(audios)
>>> seq_dataset = voc.dataset.SequenceDataset(sequences=seqs)
>>> seq_dataset.to_sqlite(db_name='gy6or6-032312.db', replace=True)
>>> print(seq_dataset)
SequenceDataset(sequences=[Sequence(units=[Unit(onset=2.18934375, offset=2.21, label='-', audio=None, spectrogram=None),
Unit(onset=2.346125, offset=2.373125, label='-', audio=None,
spectrogram=None), Unit(onset=2.50471875, offset=2.51546875,
label='-', audio=None, spectrogram=None),
Unit(onset=2.81909375, offset=2.84740625, label='-', audio=None,
spectrogram=None),
...
>>> # test that we can load the dataset
>>> seq_dataset_loaded = voc.dataset.SequenceDataset.from_sqlite(
...     db_name='gy6or6-032312.db')
>>> seq_dataset_loaded == seq_dataset
True
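Because the dataset is just sequences plus metadata, common analyses read naturally off its attributes. For example, counting the units per sequence, using the `sequences` and `units` attributes shown in the repr above:

```python
>>> # number of segmented units in each sequence of the dataset
>>> [len(seq.units) for seq in seq_dataset.sequences]
```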
### Installation

#### With `pip`

$ conda create -n vocalpy python=3.10
$ conda activate vocalpy
$ pip install vocalpy

#### With `conda`

$ conda create -n vocalpy python=3.10
$ conda activate vocalpy
$ conda install vocalpy -c conda-forge
For more detail, see Getting Started - Installation.
### Support

To report a bug or request a feature (such as a new annotation format),
please use the issue tracker on GitHub:
https://github.com/vocalpy/vocalpy/issues
To ask a question about vocalpy, discuss its development,
or share how you are using it,
please start a new topic on the VocalPy forum
with the vocalpy tag:
https://forum.vocalpy.org/
### Contribute

#### Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project, you agree to abide by its terms.
#### Contributing Guidelines

Below we provide some quick links,
but you can learn more about how you can help and give feedback
by reading our Contributing Guide.
To ask a question about vocalpy, discuss its development,
or share how you are using it,
please start a new "Q&A" topic on the VocalPy forum
with the vocalpy tag:
https://forum.vocalpy.org/
To report a bug, or to request a feature,
please use the issue tracker on GitHub:
https://github.com/vocalpy/vocalpy/issues
### CHANGELOG

You can see project history and work in progress in the CHANGELOG.
### License

The project is licensed under the BSD license.
### Citation

If you use vocalpy, please cite the DOI:
### Contributors

Thanks go to these wonderful people:

- Ralph Emilio Peterson
- Tetsuo Koyama

This project follows the all-contributors specification. Contributions of any kind are welcome!