Open JCBrouwer opened 4 years ago
Can you give more details on your initial config/run command and the one used for restarting the job? Are you warmstarting from the pretrained checkpoint but adding a new loss?
Yes, I want to warm-start from the pretrained checkpoint, although I get the same error when training from scratch with the CREPE embedding loss added in ae.gin.
My original training command:
python -m ddsp.training.ddsp_run \
--mode=train \
--alsologtostderr \
--save_dir="/home/hans/modelzoo/neuro-bass-ddsp-48kHz/" \
--gin_file=models/solo_instrument.gin \
--gin_file=datasets/tfrecord.gin \
--gin_param="TFRecordProvider.file_pattern='/home/hans/datasets/neuro-bass-ddsp/48kHz/train.tfrecord*'" \
--gin_param="batch_size=16" \
--gin_param="train_util.train.num_steps=300000" \
--gin_param="train_util.train.steps_per_save=3000" \
--gin_param="trainers.Trainer.checkpoints_to_keep=10" \
--gin_param="TFRecordProvider.example_secs=4" \
--gin_param="TFRecordProvider.sample_rate=48000" \
--gin_param="TFRecordProvider.frame_rate=250" \
--gin_param="Additive.n_samples=192000" \
--gin_param="Additive.sample_rate=48000" \
--gin_param="FilteredNoise.n_samples=192000"
After training overnight, I added PretrainedCREPEEmbeddingLoss() to ae.gin (which solo_instrument.gin inherits from):
Autoencoder.losses = [
@losses.SpectralLoss(),
@losses.PretrainedCREPEEmbeddingLoss(),
]
I then run the following command and get the error (the error is the same with or without --restore_dir):
python -m ddsp.training.ddsp_run \
--mode=train \
--alsologtostderr \
--save_dir="/home/hans/modelzoo/neuro-bass-ddsp-48kHz-crepe/" \
--restore_dir="/home/hans/modelzoo/neuro-bass-ddsp-48kHz/" \
--gin_file=models/solo_instrument.gin \
--gin_file=datasets/tfrecord.gin \
--gin_param="TFRecordProvider.file_pattern='/home/hans/datasets/neuro-bass-ddsp/48kHz/train.tfrecord*'" \
--gin_param="batch_size=16" \
--gin_param="train_util.train.num_steps=300000" \
--gin_param="train_util.train.steps_per_save=3000" \
--gin_param="trainers.Trainer.checkpoints_to_keep=10" \
--gin_param="TFRecordProvider.example_secs=4" \
--gin_param="TFRecordProvider.sample_rate=48000" \
--gin_param="TFRecordProvider.frame_rate=250" \
--gin_param="Additive.n_samples=192000" \
--gin_param="Additive.sample_rate=48000" \
--gin_param="FilteredNoise.n_samples=192000"
Update: I've found that training with the PretrainedCREPEEmbeddingLoss does work when only a single GPU is visible (via CUDA_VISIBLE_DEVICES=0).
Is there a way to allow the PretrainedCREPEEmbeddingLoss to work with multi-gpu training?
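For reference, the single-GPU workaround can be sketched as below. This is only a minimal illustration: it assumes GPU index 0 is the device to keep, and the training command itself is left as a comment because it is the same one shown in this thread, with no flags changed.

```shell
# Expose only the first GPU to TensorFlow for this shell session.
# (Assumption: index 0 is the GPU you want to train on.)
export CUDA_VISIBLE_DEVICES=0

# Then launch the same ddsp_run training command as in this thread, unchanged.
# With one visible device, training runs on that single GPU.
echo "Visible GPUs: ${CUDA_VISIBLE_DEVICES}"
```

Prefixing a single command with CUDA_VISIBLE_DEVICES=0 instead of exporting it limits the setting to that one run.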
Hello, I've trained a model for a while using the solo_instrument config at 48 kHz, but the audio is still fairly noisy even after 117k steps (spectral loss is ~9 on average).
I'd like to continue training with the PretrainedCREPEEmbeddingLoss() enabled as well to encourage more natural / perceptually realistic synthesis.
I've tried simply adding the loss to the ae.gin file, but I get the following error, which I don't really understand:
How can I train with this loss enabled?