vvolhejn / thesis

ETH Zürich MSc Thesis: Accelerating Neural Audio Synthesis
Apache License 2.0

Add data augmentation #16

Open vvolhejn opened 2 years ago

vvolhejn commented 2 years ago

The DDSP and NEWT papers don't mention data augmentation, but the RAVE paper does: "We use dequantization, random crop and allpass filters with random coefficients as our data augmentation strategy."
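
For reference, a rough sketch of what two of those augmentations could look like. This is my reading of the quoted sentence, not RAVE's actual code: I'm assuming "dequantization" means adding uniform noise of one quantization step to 16-bit audio scaled to [-1, 1), and I'm using a single first-order allpass section with a randomly drawn coefficient. The function names and the `max_coeff` parameter are made up for illustration.

```python
import random


def dequantize(samples, bits=16, rng=random):
    """Add uniform noise spanning one quantization step (assumed interpretation).

    `samples` are floats in [-1, 1) that were originally stored with `bits` bits.
    """
    step = 1.0 / (2 ** (bits - 1))
    return [s + (rng.random() - 0.5) * step for s in samples]


def random_allpass(samples, max_coeff=0.9, rng=random):
    """Apply a first-order allpass filter with a random coefficient.

    Difference equation: y[n] = -a * x[n] + x[n-1] + a * y[n-1], with |a| < 1.
    The magnitude response is flat, so only the phase of the signal changes.
    """
    a = rng.uniform(-max_coeff, max_coeff)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = -a * x + prev_x + a * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

Both functions keep the number of samples unchanged, so they could in principle be applied per training example without touching the pitch and loudness features.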

vvolhejn commented 2 years ago

Random crop is actually tricky to do with the way DDSP is set up: the audio is pre-cut into four-second chunks with a fixed overlap, so random cropping would require a larger change in which the cutting happens during data loading. We would also need to crop the pitch and loudness signals so that they stay aligned with the audio.
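
A minimal sketch of what cropping at load time might look like, assuming each example holds an uncut recording plus frame-rate pitch and loudness arrays. The key names `audio`, `f0_hz` and `loudness_db`, the function name, and the default `hop_size=64` (16 kHz audio with 250 Hz features) are assumptions for illustration, not something taken from the current pipeline.

```python
import random


def random_crop(audio, f0_hz, loudness_db, crop_frames, hop_size=64, rng=random):
    """Cut the same random window out of the audio and its control signals.

    The window is chosen in frames and converted to samples via `hop_size`
    (audio samples per feature frame), so all three signals stay aligned.
    """
    # Only use frames for which audio and both features are available.
    n_frames = min(len(f0_hz), len(loudness_db), len(audio) // hop_size)
    if crop_frames > n_frames:
        raise ValueError("recording is shorter than the requested crop")

    start = rng.randint(0, n_frames - crop_frames)  # inclusive on both ends
    end = start + crop_frames
    return {
        "audio": audio[start * hop_size : end * hop_size],
        "f0_hz": f0_hz[start:end],
        "loudness_db": loudness_db[start:end],
    }
```

Choosing the offset in frames rather than samples avoids having to interpolate the pitch and loudness signals, at the cost of restricting crops to multiples of the hop size.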