sprague252 / SVSound

GNU General Public License v3.0

SVSound

This is a Python package for reading Broadcast Wave Files in various formats, along with the metadata written by several recording devices. The content in this package was forked from Mark Sprague's collection of sound recording and analysis modules and is intended for use by his students. This package is available to everyone under the GPL 3.0 license.

Versions

Installation

Conda/Mamba

Install using conda:

conda install -c sprague252 svsound

Install using mamba:

mamba install -c sprague252 svsound

Pip

Install using pip:

pip install svsound

wavefile Module

The module wavefile contains functions for reading Broadcast Wave (.wav) files. The following proprietary Broadcast Wave file formats are currently supported:

Functions

read( )

info, wave = read(filename, t0, t1, wavetype, chunk_b, verbose)

Read a WAV file and return the file information and waveform data. This function includes support for single and multiple channel files encoded in linear PCM format with the following data formats (all little-endian):

Input parameters

filename - string with the name of the input WAV file

t0 - start time in seconds for returned data (default: 0)

t1 - end time in seconds for returned data. Value of -1 represents the end of the file. (default: -1)

wavetype - string representing the type of WAV file (default: None). Currently supported types are 'generic', 'AudioMoth', 'decimus', 'icListen', and 'zoom'. If the value is None, the wavetype is determined using identify.

chunk_b - number of bytes for each data chunk read from the file (default: 3072)

verbose - give verbose status updates (default: False)

Output

info - dictionary with file information and metadata (if available)

wave - Numpy array with waveform data values. For a single channel file, wave is a flat, 1-D array. For a multichannel recording each channel is a row in wave, so wave[0] is the first channel, wave[1] the second channel, etc.
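The t0 and t1 parameters select a time window in seconds, with -1 standing for the end of the file. A minimal sketch of how such a window could map onto sample indices is shown here. The helper name window_to_sample_range is hypothetical and not part of SVSound; the rounding and clamping choices are assumptions.

```python
def window_to_sample_range(t0, t1, fs, nsamples):
    """Convert a [t0, t1] window in seconds to a (start, stop) sample range.

    Hypothetical helper, not part of SVSound. A t1 of -1 means "to the end
    of the file", mirroring the convention documented for read().
    """
    start = int(round(t0 * fs))
    stop = nsamples if t1 == -1 else int(round(t1 * fs))
    # Clamp to the samples that actually exist in the file.
    start = max(0, min(start, nsamples))
    stop = max(start, min(stop, nsamples))
    return start, stop


# A 10 s recording at 48 kHz: seconds 2 through 5, then second 8 to the end.
print(window_to_sample_range(2.0, 5.0, 48000, 480000))
print(window_to_sample_range(8.0, -1, 48000, 480000))
```

With info["fs"] and info["Nsamples"] from the info dictionary, the same arithmetic gives the number of samples to expect in each row of wave.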

identify( )

wavetype = identify(file)

Identify the type of WAV file and return its type. Files that cannot be identified are classified as generic. Identifying the wave type allows the proprietary metadata stored in the file and filename to be extracted.

Input parameter

file - filehandle for the WAV file to be identified

Output

wavetype - string with the name of the wave file type.

wave_chunk( )

Read a WAVE file in chunks (not all at once) and return all the data. This is a back-end to the read function and is not intended for high-level use.
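The idea of reading in chunks can be sketched as follows. This is an illustration under assumptions, not the package's code: the signature of wave_chunk() is not documented here, so the function name read_in_chunks and its arguments are hypothetical. Only the default chunk size of 3072 bytes is taken from the chunk_b parameter of read().

```python
import io


def read_in_chunks(stream, nbytes, chunk_b=3072):
    """Read nbytes from stream in pieces of at most chunk_b bytes.

    Illustrative only; the real wave_chunk() may work differently.
    """
    pieces = []
    remaining = nbytes
    while remaining > 0:
        piece = stream.read(min(chunk_b, remaining))
        if not piece:  # reached the end of the stream early
            break
        pieces.append(piece)
        remaining -= len(piece)
    return b"".join(pieces)


payload = bytes(range(256)) * 40  # 10240 bytes of stand-in sample data
data = read_in_chunks(io.BytesIO(payload), len(payload))
print(len(data), data == payload)
```

One property worth noting is that 3072 is a multiple of 1, 2, 3, and 4, so a chunk of that size always holds a whole number of 8-, 16-, 24-, or 32-bit samples.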

recorders Subpackage

The subpackage recorders contains modules with specific get_info() functions for each supported recorder type. Currently supported recorders are described in the wavefile Module introduction (above). Each get_info() function has the same input and output parameters and usage.

info = get_info(file, info)

Read the information in a generic WAV file, and return the contents. Only the standard information in the fmt chunk is included in the info dictionary.

Input Parameters

file - filehandle of an open WAV file

info - (optional) dictionary that may contain file information from other sources. Defaults to an empty dictionary.

Output

info - dictionary with information read from the file. If an info dictionary was supplied as an input parameter, entries that were not changed are also included.

Standard info dictionary keys and values returned for all file types:

"bits" - integer number of bits in each sample

"block_align" - integer number of bytes in one sample frame (all channels combined) in the data

"byte_per_s" - integer number of bytes per second recorded

"chan" - integer number of channels in the file

"compress" - integer Wave file compression index. Only 1 (uncompressed integer data) and 3 (uncompressed floating point data) are currently supported.

"data0" - integer byte address of the first sample in the file

"filesize" - integer size of the file in bytes

"fs" - integer sample rate in samples/second

"Nsamples" - integer number of samples in the file (in each channel)

"wavetype" - string with the name of the file type read

Other keys and values in the info dictionary are recorder-specific and depend on the wavetype value.
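To show where the standard values come from, here is a minimal sketch that walks the RIFF chunks of a WAV stream and fills a dictionary with the keys listed above. The function name standard_info is hypothetical and the code is not the package's get_info() implementation. The fmt chunk layout (format tag, channels, sample rate, bytes per second, block align, bits per sample) is the standard one for WAV files.

```python
import io
import struct
import wave


def standard_info(stream):
    """Collect the standard fields from a RIFF/WAVE stream (sketch only)."""
    info = {"wavetype": "generic"}
    stream.seek(0, io.SEEK_END)
    info["filesize"] = stream.tell()
    stream.seek(0)
    riff, _, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    data_size = 0
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break
        chunk_id, size = struct.unpack("<4sI", header)
        body_start = stream.tell()
        if chunk_id == b"fmt ":
            (info["compress"], info["chan"], info["fs"], info["byte_per_s"],
             info["block_align"], info["bits"]) = struct.unpack(
                "<HHIIHH", stream.read(16))
        elif chunk_id == b"data":
            info["data0"] = body_start
            data_size = size
        # Jump to the next chunk; chunk bodies are padded to an even length.
        stream.seek(body_start + size + (size % 2))
    if "block_align" in info:
        info["Nsamples"] = data_size // info["block_align"]
    return info


# Build a small stereo 16-bit file in memory and inspect it.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(48000)
    w.writeframes(b"\x00" * 4 * 100)  # 100 frames of silence
info = standard_info(buf)
print(info)
```

Recorder-specific chunks (for example INFO, bext, or iXML) would be picked up in the same loop by adding further branches on the chunk identifier.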

Recorder-Specific info keys and values

AudioMoth

Recordings identified as AudioMoth recordings have info["wavetype"] set to "AudioMoth". In addition to the standard info parameters, the following metadata parameters are added:

"ICMT" - string with the contents of the ICMT subchunk.

"IART" - string with the contents of the IART subchunk.

"datestring" - string with the date and time of the beginning of the recording in ISO 8601 format.

"voltage" - string with the battery voltage at the beginning of the recording.

"gain" - string with the AudioMoth gain setting for recording.

"serial number" - string with the serial number of the AudioMoth recording device.

Decimus

Recordings identified as Decimus recordings have info["wavetype"] set to "decimus". Otherwise, info contains only the standard info keys and values.

Generic

Recordings classified as generic have info["wavetype"] set to "generic", and info contains only the standard info keys and values.

icListen

Recordings identified as icListen recordings have info["wavetype"] set to "icListen". In addition, each key/value pair encoded in the INFO chunk in the file is added to info. See the icListen documentation for details on these parameters.

The value info["cal"] contains a float64 calibration value for the data. Multiply data samples by this value to obtain calibrated values in micropascals.

Zoom

Recordings identified as Zoom recordings have info["wavetype"] set to "zoom". The following information encoded in the bext chunk is added to info as keys and values. (See Zoom documentation for details.)

"CodingHistory" - coding history string

"desc" - recording description string

"LoudnessRange" - int16 recording loudness range value

"LoudnessValue" - int16 recording loudness value

"MaxMomentaryLoudness" - int16 recording maximum momentary loudness value

"MaxShortTermLoudness" - int16 recording maximum short-term loudness value

"MaxTruePeakLevel" - int16 recording maximum true peak level

"OriginationDate" - recording origination date string

"OriginationTime" - recording origination time string

"Originator" - recording originator string

"OriginatorReference" - recording originator reference string

"TimeReferenceHigh" - int32 high 32 bits of the time reference (sample count since midnight of the first sample in the recording)

"TimeReferenceLow" - int32 low 32 bits of the time reference (sample count since midnight of the first sample in the recording)

"UMID" - UMID string

"Version" - int16 version number

The contents in the entire iXML block are stored in info["iXML"] as a string.
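In the Broadcast Wave bext chunk, the TimeReferenceLow and TimeReferenceHigh values are the two halves of a 64-bit count of samples since midnight for the first sample. A minimal sketch of combining them into a time of day follows. The helper name bext_start_time is hypothetical and not part of SVSound; the mask handles the case where the low word was read as a signed int32.

```python
def bext_start_time(time_ref_low, time_ref_high, fs):
    """Combine the two 32-bit time-reference words into a start time.

    Hypothetical helper. Returns the sample count since midnight and the
    corresponding time of day as an 'HH:MM:SS' string.
    """
    low = int(time_ref_low) & 0xFFFFFFFF    # undo signed int32 wrap-around
    high = int(time_ref_high) & 0xFFFFFFFF
    samples_since_midnight = (high << 32) | low
    seconds = samples_since_midnight // fs
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return samples_since_midnight, "%02d:%02d:%02d" % (hours, minutes, secs)


# 14:30:00 at 48 kHz is 52200 s * 48000 samples/s = 2505600000 samples.
print(bext_start_time(2505600000, 0, 48000))
```

In practice the arguments would come from info["TimeReferenceLow"], info["TimeReferenceHigh"], and info["fs"].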

levels Module

The module levels contains the functions spl, sel, spl_wav, spl_wav_dir, spl_wav_files, A_weighting, M_weighting, and weight. Each function contains a detailed usage message.
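For orientation, the standard analog A-weighting magnitude response can be written in a few lines. This is the textbook formula, not the package's A_weighting function (which may instead return digital filter parameters); the name a_weighting_db is hypothetical.

```python
import math


def a_weighting_db(f):
    """A-weighting gain in dB at frequency f (Hz), standard analog formula.

    Offered for orientation only; the package's A_weighting function may be
    implemented differently.
    """
    f2 = f * f
    ra = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.00


for f in (100.0, 1000.0, 10000.0):
    print(f, round(a_weighting_db(f), 1))
```

The curve is close to 0 dB at 1 kHz and attenuates low frequencies strongly, by roughly 19 dB at 100 Hz.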

spl

Return an array of sampled sound pressure levels using time constant T.

Usage

    SPL = spl(data, fs, weighting='A', tconst=0.125, pref=20.0)

Input Parameters

data: an array of sampled sound pressures.

fs: sampling frequency in hertz.

weighting: type of weighting to use. This parameter can be a string naming a preset: 'A' for A-weighting or 'M' for M-weighting (see documentation on the function weight() to set frequency parameters). It can also be a function that provides digital filter parameters to the weight() function. For no weighting, use weighting = 1. The default is 'A' weighting.

tconst: time constant. Defaults to 0.125 s (fast). This parameter can be the value in seconds or one of the preset strings 'Fast' (0.125 s), 'Slow' (1.000 s), or 'Impulse' (0.035 s).

pref: reference pressure. Defaults to 20.0 (micropascals, standard for atmospheric sounds). Use 1.0 (micropascals) for underwater sounds.

cal: calibration factor of the recording. This is the value that converts data samples to the appropriate pressure units (micropascals). The default value is 1 (no calibration adjustment).

pms: an initial value for the mean square pressure 'historical' value for the time constant. Use this to continue the calculation from another recording. Defaults to 0.0.

pms_return: whether or not to return the mean square pressure value for subsequent calculations. Defaults to False.

Output

SPL: a numpy array of sampled sound pressure levels corresponding to the same sampling times as the elements of data. Note that the initial elements SPL[i] are based on a truncated history because they only use pressure values from data[i] back to data[0].

pms: the mean square sound pressure for use in subsequent calculations, such as when the recording continues in another file. Only returned if the input parameter pms_return is True.
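The roles of tconst, pref, cal, and pms can be illustrated with a simplified, unweighted calculation. This is a sketch under assumptions: the function name spl_sketch is hypothetical, no frequency weighting is applied, and the exponential smoothing recursion used here is a common way to implement a time constant but is not confirmed to be what spl() does internally.

```python
import math


def spl_sketch(data, fs, tconst=0.125, pref=20.0, cal=1.0, pms=0.0):
    """Unweighted, exponentially time-averaged sound pressure level.

    Simplified stand-in for spl(); the recursion is an assumption.
    Returns (list of levels in dB, final mean-square pressure).
    """
    alpha = 1.0 - math.exp(-1.0 / (fs * tconst))  # per-sample smoothing factor
    levels = []
    for x in data:
        p = cal * x                    # calibrated pressure
        pms += alpha * (p * p - pms)   # running mean-square pressure
        # Guard against log10(0) for digital silence.
        levels.append(10.0 * math.log10(pms / pref ** 2) if pms > 0
                      else float("-inf"))
    return levels, pms


# Two seconds of a 1 kHz tone with an rms pressure of 200 micropascals.
fs = 8000
amp = 200.0 * math.sqrt(2.0)
sig = [amp * math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(2 * fs)]

whole, pms_end = spl_sketch(sig, fs)

# Process the same signal as two "files", carrying pms across the boundary.
first, pms_mid = spl_sketch(sig[:fs], fs)
second, _ = spl_sketch(sig[fs:], fs, pms=pms_mid)
print(round(whole[-1], 2), first + second == whole)
```

Passing the returned mean-square value into the next call is what lets the level trace continue smoothly across a file boundary instead of restarting from a truncated history.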

Usage Example

Read data from a single-channel file and plot it vs. time.

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from SVSound import wavefile
>>> info, data = wavefile.read('filename.wav')
>>> info['chan']
1
>>> times = np.arange(data.size) / info['fs']  # time of each sample in seconds
>>> plt.plot(times, data)
...

Note that the data in a multichannel recording has rows for each channel, so data[0] is the first channel, data[1] the second channel, etc.
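WAV files store multichannel audio interleaved, one sample per channel in turn, so the row-per-channel layout amounts to de-interleaving. A minimal sketch with plain Python lists follows; the helper name deinterleave is hypothetical, and the package itself returns a Numpy array rather than lists.

```python
def deinterleave(samples, chan):
    """Split interleaved samples into one row per channel.

    Hypothetical helper using plain lists to show the layout only.
    """
    return [samples[c::chan] for c in range(chan)]


# Stereo data stored as left, right, left, right, ...
interleaved = [1, -1, 2, -2, 3, -3]
rows = deinterleave(interleaved, 2)
print(rows[0])  # first channel
print(rows[1])  # second channel
```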