Error while Training

osok commented 3 months ago

I have a mono, 22050 Hz WAV file that is 50 minutes long. I used LJSpeechToolSet (after fixing it) to convert it to LJSpeech format, then used piper to do the preprocessing. That works. When I try to train, I get this error:

Traceback (most recent call last):
  File "/home/michael/anaconda3/envs/piper-3.9.0/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/michael/anaconda3/envs/piper-3.9.0/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/ai/michael/voice/piper/src/python/piper_train/__main__.py", line 7, in <module>
    from pytorch_lightning import Trainer
  File "/home/michael/anaconda3/envs/piper-3.9.0/lib/python3.9/site-packages/pytorch_lightning/__init__.py", line 34, in <module>
    from pytorch_lightning.callbacks import Callback  # noqa: E402
  File "/home/michael/anaconda3/envs/piper-3.9.0/lib/python3.9/site-packages/pytorch_lightning/callbacks/__init__.py", line 25, in <module>
    from pytorch_lightning.callbacks.progress import ProgressBarBase, RichProgressBar, TQDMProgressBar
  File "/home/michael/anaconda3/envs/piper-3.9.0/lib/python3.9/site-packages/pytorch_lightning/callbacks/progress/__init__.py", line 22, in <module>
    from pytorch_lightning.callbacks.progress.rich_progress import RichProgressBar  # noqa: F401
  File "/home/michael/anaconda3/envs/piper-3.9.0/lib/python3.9/site-packages/pytorch_lightning/callbacks/progress/rich_progress.py", line 20, in <module>
    from torchmetrics.utilities.imports import _compare_version
ImportError: cannot import name '_compare_version' from 'torchmetrics.utilities.imports' (/home/michael/anaconda3/envs/piper-3.9.0/lib/python3.9/site-packages/torchmetrics/utilities/imports.py)

I have tried a conda environment with Python 3.10 and 3.9 (with 3.8, piper would not even install). I also tried using the venv per the original instructions.

I have a computer with dual Nvidia 4090 GPUs.

My training script is:

python3 -m piper_train \
    --dataset-dir /ai/michael/voice/piper-out \
    --accelerator 'gpu' \
    --devices 2 \
    --batch-size 64 \
    --validation-split 0.0 \
    --num-test-examples 0 \
    --max_epochs 10000 \
    --resume_from_checkpoint /ai/michael/voice/piper-checkpoint/high/epoch=2218-step=838782.ckpt \
    --checkpoint-epochs 1 \
    --precision 32 \
    --quality high

I also tried with the medium checkpoint and got the same error.

tanfarou commented 3 months ago

Try installing torchmetrics 0.11.4: python3 -m pip install torchmetrics==0.11.4
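
To confirm which versions are actually active in the environment before and after pinning, a small check like the following may help. It is only a sketch: the helper names `installed_version` and `has_compare_version` are made up for illustration, and it assumes the failure comes from the installed torchmetrics not exposing the private `_compare_version` helper that the traceback shows pytorch_lightning importing.

```python
from importlib import metadata

# Version suggested in this thread; treat it as a pin that worked here,
# not as a general requirement.
PINNED = "0.11.4"


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_compare_version():
    """Return True if torchmetrics exposes the private helper named in the traceback."""
    try:
        from torchmetrics.utilities.imports import _compare_version  # noqa: F401
    except ImportError:
        # Covers both a missing torchmetrics and a release without the helper.
        return False
    return True


if __name__ == "__main__":
    print("torchmetrics:", installed_version("torchmetrics"), "(pinned:", PINNED + ")")
    print("pytorch-lightning:", installed_version("pytorch-lightning"))
    print("_compare_version importable:", has_compare_version())
```

Run it with the same interpreter used for training (for example the conda environment's python3). If the last line prints False, the training import would be expected to fail the same way as in the traceback.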

osok commented 3 months ago

Thank you, that fixed the problem.

rhasspy / piper

Error while Training #381