Open vvolhejn opened 2 years ago
Random cropping is tricky with the way DDSP is set up: the chunks are pre-cut into four-second segments with a fixed overlap, so it would require a larger change that moves the cutting into data loading. We would also need to cut the pitch and loudness signals at the same offsets so they stay aligned with the audio.
The DDSP and NEWT papers don't mention random cropping, but the RAVE paper does: "We use dequantization, random crop and allpass filters with random coefficients as our data augmentation strategy."
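To make the "cut during data loading" idea concrete, here is a rough sketch of what a load-time random crop could look like. It is not how the current pipeline works: the function name `random_crop`, its parameters, and the assumption that pitch and loudness hold exactly one value per `hop_size` audio samples are all hypothetical, chosen only for illustration. The point is that the start offset is drawn on the frame grid, so the audio and both feature signals are cut at the same position.

```python
import numpy as np


def random_crop(audio, pitch, loudness, crop_frames, hop_size, rng):
    """Crop audio and frame-rate features at one shared random offset.

    Hypothetical helper. Assumes pitch and loudness have one value per
    `hop_size` audio samples, with frame i starting at sample i * hop_size.
    """
    # Only count frames that are fully covered by all three signals.
    n_frames = min(len(pitch), len(loudness), len(audio) // hop_size)
    if n_frames < crop_frames:
        raise ValueError("recording is shorter than the requested crop")

    # Draw the start on the frame grid so audio and features stay aligned.
    start_frame = int(rng.integers(0, n_frames - crop_frames + 1))
    end_frame = start_frame + crop_frames

    audio_crop = audio[start_frame * hop_size:end_frame * hop_size]
    pitch_crop = pitch[start_frame:end_frame]
    loudness_crop = loudness[start_frame:end_frame]
    return audio_crop, pitch_crop, loudness_crop


# Toy usage: 10 s of fake audio at 16 kHz, features every 64 samples.
rng = np.random.default_rng(0)
hop_size = 64
audio = np.arange(16000 * 10, dtype=np.float32)
n_frames = len(audio) // hop_size
pitch = np.arange(n_frames, dtype=np.float32)
loudness = np.zeros(n_frames, dtype=np.float32)

# 1000 frames * 64 samples = 64000 samples = 4 s at 16 kHz.
a, p, l = random_crop(audio, pitch, loudness, 1000, hop_size, rng)
```

Calling this once per example per epoch would yield a different four-second window each time, at the cost of keeping whole recordings (or at least longer chunks) available to the loader instead of the pre-cut segments.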