Bark
Version: 0.2
By emphasizing filesystem directories, plain text files and a common binary array format, Bark makes it easy to use both large external projects and simple command-line utilities.
Bark's small specification and Python implementation are easy to use in custom tools.
These tools can be chained together using GNU Make to build data pipelines.
Inspired by ARF, Bark uses a hierarchy of common data storage formats; the chief advantage of this approach is that the data stay accessible to standard tools.
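For example, because sampled data are stored as raw binary arrays, they can be read with no Bark-specific code at all. A minimal sketch follows; the file name, dtype, and channel count are illustrative assumptions, since in a real tree they are recorded in the dataset's meta.yaml:

import numpy as np

# A sampled dataset is a flat binary file of interleaved samples.
# The file name, dtype, and channel count here are assumptions for
# illustration; in a real tree they come from the dataset's meta.yaml.
data = np.memmap("mic.dat", dtype=np.int16, mode="r")
data = data.reshape(-1, 2)   # reshape to (samples, channels); 2 channels assumed
print(data.shape)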
Bark trees are made from the following elements: Entry directories, grouped under a common Root directory, each containing a meta.yaml file and any number of Datasets.
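On disk, a small tree might look like the following sketch (entry and dataset names are taken from the example session later in this README; the dataset-level metadata file naming is an assumption):

black5/                        # Root
    2016-01-18/                # Entry
        meta.yaml              # Entry metadata
        enr_hvc.dat            # sampled Dataset (raw binary array)
        enr_hvc.dat.meta.yaml  # Dataset metadata (naming assumed)
        enr.label              # event Dataset
    2016-01-19/
        ...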
This repository contains the format specification, a Python interface, and a set of command-line tools. The Python interface runs under Python 3.5 through 3.8; installation with Conda is recommended.
git clone https://github.com/margoliashlab/bark
cd bark
pip install -r requirements.txt
pip install .
# optional tests
pytest -v
These installation instructions cover the main bark library and almost all of the conversion scripts and command-line data manipulation tools. Exceptions are noted below.
The requirements file omits dependencies for a few optional graphical tools included in this repository. Each tool's additional requirements are listed below and are not shared across tools; if you don't intend to use a tool, you can ignore its requirements.
bark-label-view (for hand-labeling audio data) requires:
resin (optional; bark-label-view is perfectly usable without it)
bark-psg-view (for hand-scoring PSG data) likewise has its own additional requirements.
bark-scope opens a sampled data file in neuroscope, and therefore requires an installation of neuroscope.
Note for MacOS users: you need to link the installed neuroscope to where bark-scope expects to find it:
$ ln -s /Applications/neuroscope.app/Contents/MacOS/neuroscope /usr/local/bin/neuroscope
Finally, Sox is also extremely useful for working with audio data. One conversion routine, dat-to-audio, is a wrapper around Sox and thus requires it to be installed.
Every command has help accessible with the flag -h (e.g. bark-entry -h).
bark-entry -- create entry directories for datasets
bark-attribute -- create or modify an attribute of a bark entry or dataset
bark-column-attribute -- create or modify an attribute of a bark dataset column
bark-clean-orphan-metas -- remove orphan .meta.yaml files without associated data files
dat-select -- extract a subset of channels from a sampled dataset
dat-join -- combine the channels of two or more sampled datasets
dat-split -- extract a subset of samples from a sampled dataset
dat-cat -- concatenate sampled datasets, adding more samples
dat-filter -- apply zero-phase Butterworth or Bessel filters to a sampled dataset
dat-decimate -- down-sample a sampled dataset by an integer factor; low-pass filter your data first
dat-diff -- subtract one sampled dataset channel from another
dat-ref -- for each channel, subtract the mean of all other channels, scaled by a coefficient such that the total power is minimized
dat-artifact -- remove sections of a sampled dataset that exceed a threshold
dat-enrich -- concatenate subsets of a sampled dataset based on events in an events dataset
dat-spike-detect -- detect spike events in the channels of a sampled dataset
dat-envelope-classify -- classify acoustic events, such as stimuli, by amplitude envelope
dat-segment -- segment a sampled dataset based on a band of spectral power, as described in Koumura & Okanoya

There are many external tools for processing CSV files, including pandas and csvkit.

bark-scope -- open a sampled data file in neuroscope (requires an installation of neuroscope)
bark-label-view -- annotate or review events in relation to a sampled dataset, such as birdsong syllable labels on a microphone recording
bark-psg-view -- annotate or review events on multiple channels of .dat files
bark-db -- add the metadata from a Bark tree to a database
bark-convert-rhd -- convert Intan .rhd files to datasets in a Bark entry
bark-convert-openephys -- convert a folder of Open-Ephys .kwd files to datasets in a Bark entry
bark-convert-arf -- convert an ARF file to entries in a Bark root
bark-convert-spyking -- convert Spyking Circus spike-sorted event data to a Bark event dataset
bark-convert-mountainsort -- convert MountainSort spike-sorted data to a Bark event dataset
csv-from-waveclus -- convert a wave_clus spike time file to a CSV
csv-from-textgrid -- convert a Praat TextGrid file to a CSV
csv-from-lbl -- convert an aplot lbl file to a CSV
csv-from-plexon-csv -- convert a Plexon OFS waveform CSV to a bark CSV
dat-to-wave-clus -- convert a sampled dataset to a wave_clus-compatible Matlab file
dat-to-audio -- convert a sampled dataset to an audio file; uses Sox under the hood, and so can convert to any file type Sox supports
dat-to-mda -- convert a Bark sampled dataset to a MountainSort-compatible .mda file
bark-for-each -- apply a command to a list of Entries

More tools with less generality can be found in the bark-extra repository.
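The library can also be used directly from Python: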
import bark
root = bark.read_root("black5")
root.entries.keys()
# dict_keys(['2016-01-18', '2016-01-19', '2016-01-17', '2016-01-20', '2016-01-21'])
entry = root['2016-01-18']
entry.attrs
# {'bird': 'black5',
# 'experiment': 'hvc_syrinx_screwdrive',
# 'experimenter': 'kjbrown',
# 'timestamp': '2017-02-27T11:03:21.095541-06:00',
# 'uuid': 'a53d24af-ac13-4eb3-b5f4-0600a14bb7b0'}
entry.datasets.keys()
# dict_keys(['enr_emg.dat', 'enr_mic.dat', 'enr_emg_times.csv', 'enr_hvc.dat', 'raw.label', 'enr_hvc_times.csv', 'enr.label'])
hvc = entry['enr_hvc.dat']
hvc.data.shape
# (7604129, 3)
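Dataset attributes can be read in the same way as entry attributes. A short continuation of the session above, assuming Dataset objects expose an attrs mapping like Entries do and that the sampled dataset's metadata records a sampling_rate field (both assumptions here, not shown above):

import numpy as np

# assumes the dataset's meta.yaml records 'sampling_rate' in Hz;
# the 30000 Hz fallback is an arbitrary illustration
fs = hvc.attrs.get("sampling_rate", 30000)
t = np.arange(hvc.data.shape[0]) / fs   # time of each sample, in seconds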
The Stream object in the bark.stream module exposes a powerful data pipeline design system for sampled data.
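As a rough stand-in for a Stream pipeline, the sketch below runs a comparable sampled-data workflow (a zero-phase Butterworth filter followed by integer down-sampling, mirroring dat-filter and dat-decimate) directly with numpy and scipy; it illustrates the kind of processing Stream is built for, not the Stream API itself, and the sampling rate, band edges, channel, and decimation factor are all illustrative assumptions:

import bark
import numpy as np
from scipy.signal import butter, filtfilt, decimate

# re-open the dataset from the example session above
root = bark.read_root("black5")
hvc = root['2016-01-18']['enr_hvc.dat']

fs = 30000                                 # assumed sampling rate (Hz)
x = hvc.data[:, 0].astype(np.float64)      # first channel of enr_hvc.dat
b, a = butter(3, [300 / (fs / 2), 5000 / (fs / 2)], btype="band")
y = filtfilt(b, a, x)                      # zero-phase filter, like dat-filter
y = decimate(y, 10, zero_phase=True)       # down-sample, like dat-decimate

The same steps map naturally onto the chained pipeline style that Stream provides.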
Some links to get started with Make:
The GNU Make manual: https://www.gnu.org/software/make/manual/
Mike Bostock, "Why Use Make": https://bost.ocks.org/mike/make/
Dan Meliza created ARF. Bark was written by Kyler Brown so he could finish his damn thesis in 2017. Graham Fetterman also made significant contributions.