audeering / auglib

Data augmentation for audio
https://audeering.github.io/auglib/

Handle read-only input signals #31

Closed audeerington closed 2 weeks ago

audeerington commented 1 month ago

A np.ndarray can be marked read-only through its writeable flag (signal.flags.writeable).

The documentation example that uses sox for pitch shifting returns such a read-only signal. This can lead to an error when combining multiple transforms, for example:

import numpy as np

import audb
import auglib
import audiofile

def sox_transform(
    signal: np.ndarray,
    sampling_rate: int,
) -> np.ndarray:
    import sox

    tfm = sox.Transformer()
    tfm.pitch(2)
    signal_augmented = tfm.build_array(
        input_array=signal.squeeze(),
        sample_rate_in=sampling_rate,
    )
    return signal_augmented

transform = auglib.transform.Compose(
    [
        auglib.transform.Function(
            sox_transform,
        ),
        auglib.transform.WhiteNoiseGaussian(snr_db=20),
    ]
)
files = audb.load_media(
    "emodb",
    "wav/03a01Fa.wav",
    version="1.4.1",
    verbose=False,
)
signal, sampling_rate = audiofile.read(files[0])
signal_aug = transform(signal, sampling_rate)

which leads to the error

...
    signal += from_db(gain_db) * noise_generator.normal(0, stddev, signal.shape)
ValueError: output array is read-only

This could probably be solved by replacing the in-place update (signal += ...) with an out-of-place one (signal = signal + ...) in all transforms.
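The difference between the two forms can be reproduced without sox or auglib. The following is a minimal sketch, assuming a NumPy array marked read-only via setflags() behaves like the array returned by the sox transform; the variable names are illustrative and not taken from auglib.

```python
import numpy as np

# Stand-in for the read-only array returned by the sox transform (assumption)
signal = np.zeros(4, dtype=np.float32)
signal.setflags(write=False)
noise = np.full(4, 0.5, dtype=np.float32)

try:
    signal += noise  # in-place: tries to write into the read-only buffer
    in_place_error = None
except ValueError as error:
    in_place_error = str(error)

# Out-of-place: NumPy allocates a new, writable array for the result
signal_aug = signal + noise
```

The out-of-place form costs one extra allocation per transform, but it never writes into the caller's buffer.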

hagenw commented 1 month ago

Interesting, thanks for reporting this obscure case ;)

hagenw commented 2 weeks ago

I tried to change the flag of the array directly inside auglib.transform.Function() with:

signal.setflags(write=True)

but there seems to be a reason for the flag, as this fails with

E       ValueError: cannot set WRITEABLE flag to True of this array
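A likely explanation is that the array does not own its memory. The sketch below assumes the sox output is a view on an immutable buffer; that is an assumption about its internals, not something confirmed in this thread. np.frombuffer() over a bytes object reproduces the same failure, and an explicit copy yields a writable array.

```python
import numpy as np

# Array backed by immutable bytes: NumPy marks it read-only (assumed stand-in)
raw = np.arange(4, dtype=np.float32).tobytes()
signal = np.frombuffer(raw, dtype=np.float32)

try:
    signal.setflags(write=True)  # refused: the underlying buffer is not writable
    setflags_error = None
except ValueError as error:
    setflags_error = str(error)

# A copy owns fresh memory and is always writable
writable = signal.copy()
writable += 1.0
```

Copying inside auglib.transform.Function() whenever the returned array is not writeable might be an alternative to touching every transform, at the cost of one extra copy.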

So I guess we need to avoid using += in all other transforms. Affected transforms are: