iver56 / audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
https://iver56.github.io/audiomentations/
MIT License

Add support for librosa up to 0.10.0.post2 #281

Closed by iver56 1 year ago

iver56 commented 1 year ago

This adds support for librosa versions up to 0.10.0.post2, excluding 0.10.0, which had a bug in pitch shift.

Closes #278
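The version rule described above (anything up to 0.10.0.post2, except 0.10.0) can be sketched as a small check. This is only an illustration, not the project's actual packaging configuration: the helper names `parse_version` and `is_supported_librosa` are hypothetical, the parser handles only plain release numbers with an optional `.postN` suffix, and no lower bound is checked because the PR text does not state one. Real projects usually express this with a version specifier in their package metadata instead.

```python
import re

# Hypothetical helper: parses versions such as "0.9.1" or "0.10.0.post2".
# Pre-releases, dev releases and local versions are deliberately not handled.
_VERSION_RE = re.compile(r"(\d+(?:\.\d+)*)(?:\.post(\d+))?")


def parse_version(version):
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"Unsupported version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    # Pad to three components so that "0.10" compares equal to "0.10.0".
    release += (0,) * (3 - len(release))
    # A version without a post segment sorts before any post release.
    post = int(match.group(2)) if match.group(2) is not None else -1
    return release, post


# Values taken from the PR title and description.
MAX_SUPPORTED = parse_version("0.10.0.post2")
EXCLUDED = parse_version("0.10.0")  # had a bug in pitch shift


def is_supported_librosa(version):
    """Return True if `version` is at most 0.10.0.post2 and is not 0.10.0."""
    parsed = parse_version(version)
    return parsed <= MAX_SUPPORTED and parsed != EXCLUDED


if __name__ == "__main__":
    for candidate in ["0.9.1", "0.10.0", "0.10.0.post1", "0.10.0.post2", "0.10.1"]:
        print(candidate, is_supported_librosa(candidate))
```

Comparing `(release, post)` tuples keeps the ordering logic explicit: 0.10.0 sorts before 0.10.0.post1, which sorts before 0.10.0.post2.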

iver56 commented 1 year ago

Benchmark timings with librosa 0.9.1:

AddBackgroundNoiseRelative       0.069 s (std: 0.081 s)
AddBackgroundNoiseAbsolute       0.073 s (std: 0.086 s)
AddBackgroundNoiseWithTransform  0.068 s (std: 0.080 s)
AddGaussianNoise                 0.010 s (std: 0.000 s)
AddGaussianSNR                   0.013 s (std: 0.002 s)
ApplyImpulseResponseWithTail     0.032 s
ApplyImpulseResponseLeaveLengthUnchanged 0.032 s
AddShortNoisesAbsolute           0.017 s (std: 0.007 s)
AddShortNoisesRelative           0.014 s (std: 0.011 s)
AddShortNoisesWithSignalGain     0.039 s (std: 0.015 s)
AddShortNoisesWithNoiseTransform 4.740 s (std: 2.308 s)
AdjustDuration                   0.000 s (std: 0.000 s)
AdjustDurationPadEndSilence      0.001 s
AdjustDurationPadStartSilence    0.001 s
AdjustDurationPadStartWrap       0.001 s
AdjustDurationPadStartReflect    0.001 s
BandPassFilter                   0.006 s (std: 0.001 s)
BandStopFilter                   0.006 s (std: 0.000 s)
ClippingDistortion               0.003 s (std: 0.001 s)
Gain                             0.001 s (std: 0.000 s)
GainTransition                   0.003 s (std: 0.001 s)
HighPassFilter                   0.005 s (std: 0.000 s)
HighShelfFilter                  0.005 s (std: 0.000 s)
LowPassFilter                    0.005 s (std: 0.000 s)
LowShelfFilter                   0.005 s (std: 0.001 s)
PitchShift                       0.601 s (std: 0.064 s)
Lambda                           0.001 s
Limiter                          0.005 s (std: 0.000 s)
LoudnessNormalization            0.015 s (std: 0.000 s)
Mp3CompressionLameenc            0.205 s (std: 0.083 s)
Mp3CompressionPydub              0.156 s (std: 0.011 s)
Normalize                        0.002 s
PaddingSilenceEnd                0.001 s (std: 0.000 s)
PaddingWrapEnd                   0.001 s (std: 0.000 s)
PaddingReflectEnd                0.001 s (std: 0.000 s)
PaddingSilenceStart              0.000 s (std: 0.001 s)
PaddingWrapStart                 0.001 s (std: 0.000 s)
PeakingFilter                    0.004 s (std: 0.000 s)
PolarityInversion                0.000 s
Resample                         0.469 s (std: 0.057 s)
Reverse                          0.000 s
RoomSimulator                    0.384 s (std: 0.116 s)
SevenBandParametricEQ            0.031 s (std: 0.001 s)
ShiftWithoutFade                 0.001 s (std: 0.000 s)
ShiftWithShortFade               0.001 s (std: 0.000 s)
ShiftWithoutRolloverWithLongFade 0.001 s (std: 0.000 s)
TanhDistortion                   0.010 s (std: 0.001 s)
TimeMask                         0.001 s (std: 0.000 s)
TimeStretch                      0.184 s (std: 0.020 s)
Trim                             0.006 s
BigCompose                       0.274 s (std: 0.343 s)
AirAbsorption                    0.046 s (std: 0.002 s)

Benchmark timings with librosa 0.10.0.post1:

AddBackgroundNoiseRelative       0.101 s (std: 0.120 s)
AddBackgroundNoiseAbsolute       0.100 s (std: 0.120 s)
AddBackgroundNoiseWithTransform  0.099 s (std: 0.117 s)
AddGaussianNoise                 0.011 s (std: 0.000 s)
AddGaussianSNR                   0.013 s (std: 0.002 s)
ApplyImpulseResponseWithTail     0.122 s
ApplyImpulseResponseLeaveLengthUnchanged 0.116 s
AddShortNoisesAbsolute           0.578 s (std: 0.251 s)
AddShortNoisesRelative           0.403 s (std: 0.249 s)
AddShortNoisesWithSignalGain     0.771 s (std: 0.246 s)
AddShortNoisesWithNoiseTransform 5.810 s (std: 2.494 s)
AdjustDuration                   0.000 s (std: 0.000 s)
AdjustDurationPadEndSilence      0.001 s
AdjustDurationPadStartSilence    0.001 s
AdjustDurationPadStartWrap       0.001 s
AdjustDurationPadStartReflect    0.001 s
BandPassFilter                   0.006 s (std: 0.000 s)
BandStopFilter                   0.006 s (std: 0.000 s)
ClippingDistortion               0.003 s (std: 0.000 s)
Gain                             0.001 s (std: 0.000 s)
GainTransition                   0.003 s (std: 0.001 s)
HighPassFilter                   0.005 s (std: 0.000 s)
HighShelfFilter                  0.004 s (std: 0.000 s)
LowPassFilter                    0.005 s (std: 0.001 s)
LowShelfFilter                   0.004 s (std: 0.000 s)
PitchShift                       0.149 s (std: 0.008 s)
Lambda                           0.000 s
Limiter                          0.005 s (std: 0.001 s)
LoudnessNormalization            0.016 s (std: 0.001 s)
Mp3CompressionLameenc            0.137 s (std: 0.008 s)
Mp3CompressionPydub              0.243 s (std: 0.010 s)
Normalize                        0.001 s
PaddingSilenceEnd                0.001 s (std: 0.000 s)
PaddingWrapEnd                   0.001 s (std: 0.000 s)
PaddingReflectEnd                0.001 s (std: 0.000 s)
PaddingSilenceStart              0.000 s (std: 0.000 s)
PaddingWrapStart                 0.001 s (std: 0.000 s)
PeakingFilter                    0.004 s (std: 0.000 s)
PolarityInversion                0.001 s
Resample                         0.005 s (std: 0.001 s)
Reverse                          0.000 s
RoomSimulator                    0.378 s (std: 0.112 s)
SevenBandParametricEQ            0.031 s (std: 0.002 s)
ShiftWithoutFade                 0.001 s (std: 0.000 s)
ShiftWithShortFade               0.001 s (std: 0.000 s)
ShiftWithoutRolloverWithLongFade 0.001 s (std: 0.000 s)
TanhDistortion                   0.007 s (std: 0.000 s)
TimeMask                         0.001 s (std: 0.000 s)
TimeStretch                      0.151 s (std: 0.017 s)
Trim                             0.024 s
BigCompose                       0.109 s (std: 0.118 s)
AirAbsorption                    0.045 s (std: 0.001 s)
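The two listings are easier to compare as ratios. The following is a minimal sketch, assuming each line has the form `Name  0.123 s (std: 0.045 s)` with the std part optional, as in the output above. The function names `parse_timings` and `compare_timings` are hypothetical and not part of audiomentations, and only a few rows from the listings are embedded to keep the example short.

```python
import re

# Matches lines such as "PitchShift   0.601 s (std: 0.064 s)" or "Reverse   0.000 s".
LINE_RE = re.compile(r"^(\w+)\s+(\d+\.\d+) s(?: \(std: (\d+\.\d+) s\))?$")


def parse_timings(text):
    """Map transform name -> reported time in seconds."""
    timings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            timings[match.group(1)] = float(match.group(2))
    return timings


def compare_timings(old, new):
    """Return (name, old_s, new_s, new/old) rows, sorted by ratio ascending.

    Entries reported as 0.000 s in either run are skipped, because a ratio
    would not be meaningful at that resolution.
    """
    rows = []
    for name, old_s in old.items():
        new_s = new.get(name)
        if new_s is not None and old_s > 0 and new_s > 0:
            rows.append((name, old_s, new_s, new_s / old_s))
    return sorted(rows, key=lambda row: row[3])


# A few rows copied from the two listings above.
LIBROSA_0_9_1 = """\
PitchShift                       0.601 s (std: 0.064 s)
Resample                         0.469 s (std: 0.057 s)
AddShortNoisesAbsolute           0.017 s (std: 0.007 s)
Reverse                          0.000 s
"""

LIBROSA_0_10_0_POST1 = """\
PitchShift                       0.149 s (std: 0.008 s)
Resample                         0.005 s (std: 0.001 s)
AddShortNoisesAbsolute           0.578 s (std: 0.251 s)
Reverse                          0.000 s
"""

if __name__ == "__main__":
    rows = compare_timings(
        parse_timings(LIBROSA_0_9_1), parse_timings(LIBROSA_0_10_0_POST1)
    )
    for name, old_s, new_s, ratio in rows:
        print(f"{name:<32} {old_s:.3f} s -> {new_s:.3f} s  (x{ratio:.2f})")
```

A ratio below 1 means the transform ran faster with the newer librosa (for example PitchShift and Resample), and a ratio above 1 means it ran slower (for example AddShortNoisesAbsolute). Pasting the full listings into the two strings would extend the comparison to every transform.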