Open hayduck opened 9 months ago
If the experiment config is the same as the others, with just different input and output frequencies, I'm happy to give that a shot and make a PR. I just have no idea whether there are other changes.
Same here. Looking for the music upscaling model
Hello, I would also be interested in running the model trained on music data. Are there any updates on this?
@hayduck @yihaoch @pf-mpa, the author did answer this question in another issue: https://github.com/slp-rl/aero/issues/5#issuecomment-1513243707
They created a musdb-mixture-11-44.yaml file in the dset folder for musdb containing the following:
# @package dset
name: musdb-mixture-11-44
train: egs/musdb18hq/11025-44100_mixture/tr
valid: egs/musdb18hq/11025-44100_mixture/val
test: egs/musdb18hq/11025-44100_mixture/val
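For anyone who wants the same file for other input and output frequencies, here is a minimal sketch that reproduces the pattern of the file above. The function name `dset_yaml` is made up for illustration, and the naming scheme (sample rate in kHz, rounded down, in the name; full rates in the folder) is only an assumption inferred from this single example, not something the author confirmed.

```python
def dset_yaml(lr_sr, hr_sr, root="egs/musdb18hq"):
    """Return dset config text following the pattern of musdb-mixture-11-44.yaml.

    Assumption: other sample-rate pairs follow the same naming scheme
    as the 11025 -> 44100 example quoted in this thread.
    """
    name = f"musdb-mixture-{lr_sr // 1000}-{hr_sr // 1000}"
    folder = f"{root}/{lr_sr}-{hr_sr}_mixture"
    lines = [
        "# @package dset",
        f"name: {name}",
        f"train: {folder}/tr",
        f"valid: {folder}/val",
        f"test: {folder}/val",  # the quoted file reuses the validation set for test
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(dset_yaml(11025, 44100), end="")
```

The folders themselves still have to be produced by the data preparation step; this only generates the config text.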
It doesn't look like they used an experiment file directly. Instead, they specified the options as command line arguments like this:
python train.py \
dset=musdb-mixture-11-44 \
experiment=<experiment_name> \
experiment.nfft=512 \
experiment.hop_length=64 \
experiment.lr_sr=11025 \
experiment.hr_sr=44100 \
epochs=696 \
eval_every=175 \
losses=[stft] \
experiment.batch_size=16 \
cross_valid_every=5 \
wandb.resume=false \
experiment.aero.spec_upsample=true \
experiment.upsample=false \
experiment.aero.enc_freq_attn=0 \
experiment.aero.norm_starts=2 \
experiment.aero.dconv_time_attn=2 \
experiment.aero.dconv_lstm=2 \
experiment.aero.freq_ends=4 \
experiment.aero.strides=[4,4,2,2] \
experiment.aero.channels=48 \
experiment.melgan_discriminator.ndf=16 \
+experiment.speech_mode=false \
cross_valid=false \
joint_evaluate_and_enhance=true \
ddp=true \
visqol=false
Note: I am not yet sure whether <experiment_name> is the file name of a yaml experiment config that is being overridden, or the name: value for the experiment.
I'm currently training another model, but I'll make a PR of a yaml file containing those experiment options when I get around to trying this again.
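Until such a file exists, the dotted overrides from the command above can be nested into a dictionary as a starting point for one. This is a stdlib-only sketch and does not use Hydra itself; the helper names `parse_value` and `overrides_to_dict` are hypothetical, and whether the resulting `experiment` subtree matches the layout the repo expects in an experiment yaml is an assumption that still needs checking.

```python
def parse_value(raw):
    """Best-effort conversion of an override value string to a Python value."""
    if raw.startswith("[") and raw.endswith("]"):
        inner = raw[1:-1].strip()
        return [parse_value(p.strip()) for p in inner.split(",")] if inner else []
    if raw in ("true", "false"):
        return raw == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # anything else stays a string, e.g. "stft"


def overrides_to_dict(overrides):
    """Nest dotted key=value overrides into a dictionary.

    Leave out config-group selections such as dset=... and experiment=...;
    they pick a file rather than set a value.
    """
    config = {}
    for item in overrides:
        key, _, raw = item.lstrip("+").partition("=")
        *parents, leaf = key.split(".")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = parse_value(raw)
    return config


if __name__ == "__main__":
    example = [
        "experiment.nfft=512",
        "experiment.hop_length=64",
        "experiment.aero.strides=[4,4,2,2]",
        "+experiment.speech_mode=false",
        "losses=[stft]",
    ]
    print(overrides_to_dict(example))
```

Writing the dictionary out as yaml would need a third-party library such as PyYAML, which is left out here to keep the sketch dependency-free.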
It would be interesting to upgrade the hdemucs model used by aero to the newest htdemucs, which has a far better SDR.
Training Hint: Consider augmenting your MUSDB18 dataset before running the aero resample.py data preparation script. Useful tools:
- demucs automix.py (requires a local demucs install with pip install -e .) creates musically plausible mashups
- spotify pedalboard can be used for "on-the-fly" augmentations during training (example: augm_data() function)
- audiomentations can also be used for "on-the-fly" audio augmentations (see the previous augm_data() example)
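To make the idea concrete, here is a very small, dependency-free sketch of stem-level augmentation: each stem gets a random gain and the stems are summed into a new mixture. It is a simplified stand-in and not what demucs automix.py, pedalboard, or audiomentations actually do (those tools do considerably more); the function names, the gain range, and the plain-list audio representation are all assumptions chosen for illustration. The augmentation should be applied to the full-band audio before the low-resolution version is derived, so that each input/target pair stays consistent.

```python
import random


def random_gain(samples, rng, low_db=-6.0, high_db=3.0):
    """Scale a stem by a random gain drawn uniformly in decibels."""
    gain_db = rng.uniform(low_db, high_db)
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


def remix(stems, rng):
    """Sum randomly re-gained stems into a new mixture.

    stems: dict mapping stem name -> list of float samples (equal lengths).
    The result is peak-normalised only if it would otherwise exceed 1.0.
    """
    augmented = [random_gain(s, rng) for s in stems.values()]
    mixture = [sum(frame) for frame in zip(*augmented)]
    peak = max((abs(x) for x in mixture), default=0.0)
    if peak > 1.0:
        mixture = [x / peak for x in mixture]
    return mixture


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    toy_stems = {
        "vocals": [0.5, -0.5, 0.25],
        "drums": [0.5, 0.5, -0.25],
    }
    print(remix(toy_stems, rng))
```

Calling `remix` every time a training example is fetched gives a different mixture per epoch, which is the "on-the-fly" pattern mentioned above; calling it once per track and saving the result gives the offline variant that runs before data preparation.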
Hello,
I'm trying to use predict to improve some old music I have, as was done here in your project:
Section Ⅴ: Examples for samples upsampled from 11.025kHz to 44.1kHz. The model is trained on the train set of the MusDB-HQ dataset.
but I think I need a musdb experiment yaml file. I was able to download the checkpoint.tf and tried to use the output naming convention to predict, but I believe there is no matching experiment yaml file. The dset training hydra config would be nice too, if possible.
Thanks much, and cool project.