acids-ircam / RAVE

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

resample before random crop in dataset? #301

Open victor-shepardson opened 6 months ago

victor-shepardson commented 6 months ago

https://github.com/acids-ircam/RAVE/blob/8b250310fecfe61a6d9d53e8e5551851f4638d35/rave/dataset.py#L239

Should resampling be moved to before random cropping here? Currently it seems to produce batches whose lengths aren't multiples of the latent downsampling factor when converting between 44.1 kHz and 48 kHz.
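To illustrate the length mismatch, here is a minimal sketch, not the actual code in `rave/dataset.py`. The linear-interpolation resampler, the helper names, and the downsampling factor of 2048 are all assumptions chosen for the example. It compares cropping then resampling against resampling then cropping:

```python
import numpy as np

def resample_linear(x, sr_in, sr_out):
    """Stand-in resampler using linear interpolation (hypothetical helper,
    not the resampler RAVE uses)."""
    n_out = int(round(len(x) * sr_out / sr_in))
    # Positions of the output samples, expressed in input-sample units.
    t_out = np.arange(n_out) * (sr_in / sr_out)
    return np.interp(t_out, np.arange(len(x)), x)

def random_crop(x, n, rng):
    """Return a random contiguous window of n samples (hypothetical helper)."""
    start = rng.integers(0, len(x) - n + 1)
    return x[start:start + n]

rng = np.random.default_rng(0)
ratio = 2048            # assumed latent downsampling factor
n_signal = 64 * ratio   # desired training excerpt length: 131072 samples
audio = rng.standard_normal(4 * n_signal)  # pretend this was stored at 44.1 kHz

# Current order: crop first, then resample. The length gets scaled by 48000/44100.
crop_then_resample = resample_linear(
    random_crop(audio, n_signal, rng), 44100, 48000
)

# Proposed order: resample first, then crop to exactly n_signal samples.
resample_then_crop = random_crop(
    resample_linear(audio, 44100, 48000), n_signal, rng
)

print(len(crop_then_resample), len(crop_then_resample) % ratio)
print(len(resample_then_crop), len(resample_then_crop) % ratio)
```

With these assumed numbers, the first ordering gives an excerpt of 142663 samples, which is not a multiple of 2048, while the second always gives exactly `n_signal` samples.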

I notice it also assumes that datasets processed at 48k with older versions of RAVE (where the sample rate wasn't stored as metadata) are 44.1k and tries to convert them, which is confusing for anyone who has been using 48k and upgrades. Maybe it should do no conversion by default if the data sample rate isn't known, and print a warning? I did it like this on my fork: https://github.com/victor-shepardson/RAVE/blob/ff2218369f1589b06587bb58f37a609dc483d464/rave/dataset.py#L336
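The fallback suggested above could look roughly like the following sketch. It is not the code from the linked fork: the function names and the `"sr"` metadata key are hypothetical, picked only to show the "no conversion plus a warning" behaviour.

```python
import warnings

def resolve_source_sr(metadata, target_sr):
    """Return the sample rate to assume for stored audio.

    `metadata` is a dict that may or may not contain an "sr" entry
    (hypothetical key name). If it is missing, assume the data already
    matches `target_sr` and warn, rather than guessing 44.1 kHz.
    """
    sr = metadata.get("sr")
    if sr is None:
        warnings.warn(
            f"Dataset has no stored sample rate; assuming {target_sr} Hz "
            "and skipping resampling."
        )
        return target_sr
    return int(sr)

def needs_resampling(metadata, target_sr):
    """True only when a known source rate differs from the target rate."""
    return resolve_source_sr(metadata, target_sr) != target_sr

# Dataset preprocessed by an older version, no sample rate recorded:
print(needs_resampling({}, 48000))
# Dataset with an explicit 44.1 kHz rate, training at 48 kHz:
print(needs_resampling({"sr": 44100}, 48000))
```

Under this scheme a legacy 48k dataset is left untouched after an upgrade, and the warning tells the user that the rate was assumed rather than read from metadata.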

domkirke commented 4 months ago

You're absolutely right! I will fix that in the next version, thanks for the issue :)