Closed ilya16 closed 11 months ago
@ilya16 yes indeed that does not seem right :disappointed:
I decided to do all the resampling and curtailing / padding at the highest target sample freq first, and only then resample to the rest of the target sample freqs.
Want to see if that addresses the issue?
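
For readers following along, here is a minimal sketch of that strategy, not the actual code from the repo. The function names (`naive_resample`, `crop_or_pad`, `aligned_views`) are hypothetical, the nearest-neighbour resampler is only a dependency-free stand-in for a real resampler, and it assumes `max_length` is counted in samples at the highest target rate. The point is that the random crop happens exactly once, so every target sample rate sees the same time window.

```python
import random


def naive_resample(samples, orig_sr, target_sr):
    """Nearest-neighbour resampling; a stand-in for a proper resampler."""
    if orig_sr == target_sr:
        return list(samples)
    n_out = len(samples) * target_sr // orig_sr
    last = len(samples) - 1
    return [samples[min(i * orig_sr // target_sr, last)] for i in range(n_out)]


def crop_or_pad(samples, max_length, rng):
    """Randomly crop to max_length, or right-pad with zeros if too short."""
    if len(samples) > max_length:
        start = rng.randrange(len(samples) - max_length + 1)
        return samples[start:start + max_length]
    return list(samples) + [0.0] * (max_length - len(samples))


def aligned_views(samples, orig_sr, target_srs, max_length, rng=None):
    """Return one view per target rate, all covering the same time window.

    Assumption: max_length is measured at the highest target sample rate.
    """
    rng = rng or random
    top_sr = max(target_srs)
    # 1. Resample to the highest target rate.
    top = naive_resample(samples, orig_sr, top_sr)
    # 2. Crop / pad once, so a single start position is drawn.
    top = crop_or_pad(top, max_length, rng)
    # 3. Derive every other rate from the already-cropped audio.
    return tuple(naive_resample(top, top_sr, sr) for sr in target_srs)
```

With a ramp signal as input, the first sample of every returned view comes from the same source position, which is what the separate per-rate crops failed to guarantee.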
@lucidrains looks good!
When audio lengths are greater than `max_length` and multiple target sample rates are used, the `SoundDataset` samples audios with different start positions: https://github.com/lucidrains/audiolm-pytorch/blob/c65bb97662a1ef29ec6359d25bf4022c2cb82a27/audiolm_pytorch/data.py#L86-L97

Affects the training data for `CoarseTransformer`.