Open ken2190 opened 1 month ago
I followed the instructions in this Colab notebook to enable saving multiple checkpoints, but it doesn't seem to work. Has anyone solved this issue?
https://colab.research.google.com/github/rmcpantoja/piper/blob/master/notebooks/piper_multilingual_training_notebook.ipynb#scrollTo=ickQlOCRjkBL
python -m piper_train \
  --dataset-dir "/home/ubuntu/DATA/piper/experiment/single_spk/" \
  --accelerator 'gpu' \
  --devices 4 \
  --batch-size 26 \
  --validation-split 0.05 \
  --num-test-examples 10 \
  --quality "high" \
  --checkpoint-epochs 5 \
  --log_every_n_steps 50 \
  --max_epochs 5000 \
  --resume_from_checkpoint "/home/ubuntu/DATA/piper/piper-checkpoints/en/en_US/lessac/high/epoch=2218-step=838782.ckpt" \
  --precision 32 \
  --gpus='0,1,2,3' \
  --strategy=ddp \
  --num_ckpt 1
Error:
python -m piper_train \
> --dataset-dir "/home/ubuntu/DATA/piper/experiment/single_spk/" \
> --accelerator 'gpu' \
> --devices 4 \
> --batch-size 26 \
> --validation-split 0.0 \
> --num-test-examples 2 \
> --quality "high" \
> --checkpoint-epochs 5 \
> --log_every_n_steps 50 \
> --max_epochs 5000 \
> --resume_from_single_speaker_checkpoint "/home/ubuntu/DATA/piper/piper-checkpoints/en/en_US/lessac/high/epoch=2218-step=838782.ckpt" \
> --precision 32 \
> --gpus='0,1,2,3' \
> --strategy=ddp \
> --num_ckpt 1
usage: __main__.py [-h] --dataset-dir DATASET_DIR
                   [--checkpoint-epochs CHECKPOINT_EPOCHS]
                   [--quality {x-low,medium,high}]
                   [--resume_from_single_speaker_checkpoint RESUME_FROM_SINGLE_SPEAKER_CHECKPOINT]
                   [--logger [LOGGER]]
                   [--enable_checkpointing [ENABLE_CHECKPOINTING]]
                   [--default_root_dir DEFAULT_ROOT_DIR]
                   [--gradient_clip_val GRADIENT_CLIP_VAL]
                   [--gradient_clip_algorithm GRADIENT_CLIP_ALGORITHM]
                   [--num_nodes NUM_NODES] [--num_processes NUM_PROCESSES]
                   [--devices DEVICES] [--gpus GPUS]
                   [--auto_select_gpus [AUTO_SELECT_GPUS]]
                   [--tpu_cores TPU_CORES] [--ipus IPUS]
                   [--enable_progress_bar [ENABLE_PROGRESS_BAR]]
                   [--overfit_batches OVERFIT_BATCHES]
                   [--track_grad_norm TRACK_GRAD_NORM]
                   [--check_val_every_n_epoch CHECK_VAL_EVERY_N_EPOCH]
                   [--fast_dev_run [FAST_DEV_RUN]]
                   [--accumulate_grad_batches ACCUMULATE_GRAD_BATCHES]
                   [--max_epochs MAX_EPOCHS] [--min_epochs MIN_EPOCHS]
                   [--max_steps MAX_STEPS] [--min_steps MIN_STEPS]
                   [--max_time MAX_TIME]
                   [--limit_train_batches LIMIT_TRAIN_BATCHES]
                   [--limit_val_batches LIMIT_VAL_BATCHES]
                   [--limit_test_batches LIMIT_TEST_BATCHES]
                   [--limit_predict_batches LIMIT_PREDICT_BATCHES]
                   [--val_check_interval VAL_CHECK_INTERVAL]
                   [--log_every_n_steps LOG_EVERY_N_STEPS]
                   [--accelerator ACCELERATOR] [--strategy STRATEGY]
                   [--sync_batchnorm [SYNC_BATCHNORM]]
                   [--precision PRECISION]
                   [--enable_model_summary [ENABLE_MODEL_SUMMARY]]
                   [--weights_save_path WEIGHTS_SAVE_PATH]
                   [--num_sanity_val_steps NUM_SANITY_VAL_STEPS]
                   [--resume_from_checkpoint RESUME_FROM_CHECKPOINT]
                   [--profiler PROFILER] [--benchmark [BENCHMARK]]
                   [--deterministic [DETERMINISTIC]]
                   [--reload_dataloaders_every_n_epochs RELOAD_DATALOADERS_EVERY_N_EPOCHS]
                   [--auto_lr_find [AUTO_LR_FIND]]
                   [--replace_sampler_ddp [REPLACE_SAMPLER_DDP]]
                   [--detect_anomaly [DETECT_ANOMALY]]
                   [--auto_scale_batch_size [AUTO_SCALE_BATCH_SIZE]]
                   [--plugins PLUGINS] [--amp_backend AMP_BACKEND]
                   [--amp_level AMP_LEVEL]
                   [--move_metrics_to_cpu [MOVE_METRICS_TO_CPU]]
                   [--multiple_trainloader_mode MULTIPLE_TRAINLOADER_MODE]
                   --batch-size BATCH_SIZE
                   [--validation-split VALIDATION_SPLIT]
                   [--num-test-examples NUM_TEST_EXAMPLES]
                   [--max-phoneme-ids MAX_PHONEME_IDS]
                   [--hidden-channels HIDDEN_CHANNELS]
                   [--inter-channels INTER_CHANNELS]
                   [--filter-channels FILTER_CHANNELS]
                   [--n-layers N_LAYERS] [--n-heads N_HEADS] [--seed SEED]
__main__.py: error: unrecognized arguments: --num_ckpt 1
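For anyone hitting the same message: the usage text printed with the error does not list `--num_ckpt`, and argparse rejects any flag the parser was never told about. My guess (an assumption, not something confirmed in this thread) is that the notebook runs a version of `piper_train` that registers the flag, while the installed one does not. Below is a minimal sketch of that behaviour using only the standard library. The parser and the `build_parser` helper are hypothetical stand-ins, not the real `piper_train` code.

```python
import argparse


def build_parser(with_num_ckpt: bool) -> argparse.ArgumentParser:
    # Hypothetical stand-in for the training CLI; only a few flags are modelled.
    parser = argparse.ArgumentParser(prog="piper_train_sketch")
    parser.add_argument("--dataset-dir", required=True)
    parser.add_argument("--checkpoint-epochs", type=int, default=1)
    if with_num_ckpt:
        # Assumption: only the notebook's version of the code registers this flag.
        parser.add_argument("--num_ckpt", type=int, default=1)
    return parser


argv = ["--dataset-dir", "data", "--checkpoint-epochs", "5", "--num_ckpt", "1"]

# parse_known_args() returns unrecognized arguments instead of exiting,
# which makes it easy to see what parse_args() would complain about.
_, unknown = build_parser(with_num_ckpt=False).parse_known_args(argv)
print("without the flag registered:", unknown)

args, unknown_with_flag = build_parser(with_num_ckpt=True).parse_known_args(argv)
print("with the flag registered:", args.num_ckpt, unknown_with_flag)
```

If that guess is right, checking which `piper_train` package is actually installed (for example with `python -c "import piper_train; print(piper_train.__file__)"`) should show whether it is the upstream code or the notebook's variant.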
Interval to save best k models:
Set to 0 if you want to disable saving multiple models. If this is the case, check the checkbox below.
If set to 1, models will be saved with the file name epoch=xx-step=xx.ckpt, so you will need to empty Drive's trash every so often.