Closed: felixbur closed this 10 months ago
For wav2vec2 and HuBERT, audio is already resampled on the fly if it is not at 16 kHz.
For the others, I provided a Python script in the emofilm data directory that converts to 16 kHz: convert_to_16k.py. It uses sox as its backend, so it may only work on Unix.
If needed, I propose using torchaudio.transforms.Resample to avoid adding new requirements.
OK, I needed that for the MOS and SNR models and can add it there. The disadvantage, of course, is that a database that is not in 16 kHz will then be resampled over and over again, potentially 4 times in one run. So I wonder if we should implement a "resample" module that would affect only the train and test splits of the project (not the whole databases).
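A rough sketch of the "resample once" idea, assuming an in-memory cache keyed by file path. `ResampleOnce` and `fake_resample` are hypothetical names, and the stand-in function only marks where the actual resampling would happen:

```python
from typing import Callable, Dict


class ResampleOnce:
    """Cache resampled audio per file so each file is converted only once per run.

    Hypothetical sketch, not the module that was eventually implemented.
    """

    def __init__(self, resample_fn: Callable[[str], object]):
        self._resample_fn = resample_fn
        self._cache: Dict[str, object] = {}
        self.calls = 0  # number of real resampling operations performed

    def get(self, path: str) -> object:
        if path not in self._cache:
            self.calls += 1
            self._cache[path] = self._resample_fn(path)
        return self._cache[path]


def fake_resample(path: str) -> str:
    # Stand-in for loading the file and resampling it to 16 kHz
    return f"{path}@16k"


# Only the files in the train and test splits, not the whole database
split_files = ["train/a.wav", "test/b.wav"]
resampler = ResampleOnce(fake_resample)

# Four consumers (e.g. different feature extractors) request the same files
for _ in range(4):
    outputs = [resampler.get(p) for p in split_files]

print(resampler.calls)  # each file was resampled once, not four times
```

A disk-backed cache would serve the same purpose across runs; the in-memory version is only meant to show the shape of the idea.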
done with 0.62.0
Most models require a 16 kHz sampling rate, but data might come at other rates, so it would be nice to resample data automatically.