ignacio82 opened 11 months ago
Have you installed the NVIDIA container toolkit? Does the sample NVIDIA container run and detect the GPU?
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
ignacio@xps:~$ docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
5e8117c0bd28: Pull complete
Digest: sha256:8eab65df33a6de2844c9aefd19efe8ddb87b7df5e9185a4ab73af936225685bb
Status: Downloaded newer image for ubuntu:latest
Tue Dec 12 03:14:08 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03 Driver Version: 535.129.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 1060 6GB Off | 00000000:01:00.0 On | N/A |
| 43% 40C P0 24W / 120W | 587MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
In your original post, I don't see "--runtime=nvidia --gpus all" in your docker command
$ docker run --runtime=nvidia --gpus all -ti -v /home/ignacio/out-train:/train -v /home/ignacio/piper-checkpoints:/piper-checkpoints cloning bash
=============
== PyTorch ==
=============
NVIDIA Release 22.03 (build 33569136)
PyTorch Version 1.12.0a0+2c916ef
Container image Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copyright (c) 2014-2022 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006 Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015 Google Inc.
Copyright (c) 2015 Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for PyTorch. NVIDIA recommends the use of the following flags:
docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...
root@ac304033e8fc:/workspace# python3 -m piper_train \
> --dataset-dir /train/ \
> --accelerator 'gpu' \
> --devices 1 \
> --batch-size 32 \
> --validation-split 0.0 \
> --num-test-examples 0 \
> --max_epochs 10000 \
> --resume_from_checkpoint /piper-checkpoints/epoch=4641-step=3104302.ckpt \
> --checkpoint-epochs 1 \
> --precision 32
/opt/conda/bin/python3: No module named piper_train
root@ac304033e8fc:/workspace#
No difference
Me personally, I didn't see much advantage to running training in docker so I just ran the training in my bare metal host OS. But I was curious why you're having so much trouble. Your docker command shows that you're running a docker container called cloning. My guess is that container doesn't have piper installed so, as expected, you can't run piper_train in that container. You could either install piper modules into that container interactively or build a new container.
I looked at TRAINING.md and it recommends the following:
It is highly recommended to train with the following Dockerfile:
FROM nvcr.io/nvidia/pytorch:22.03-py3
RUN pip3 install 'pytorch-lightning'
ENV NUMBA_CACHE_DIR=.numba_cache
But this seems incomplete to me too, since the container built from that Dockerfile also doesn't have the tools installed. If you build a container from the above Dockerfile, run it and do pip list from the command line, you'll see there's no piper-* modules.
My guess is, the 3 lines in the above Dockerfile will not build a container that's ready to do piper training. There are numerous ways to fix this; a couple of options are installing the piper source into the running container interactively, or extending the Dockerfile so that it clones the piper repo and installs its Python package.
Either way, once the piper environment is configured in the container, you should be able to run piper_train in your container.
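For the interactive option, the rough shape would be something like this, run from a shell inside the container (only a sketch; it assumes the container's Python can install piper's dependencies, which the stock Python 3.8 in this image turned out not to, as the errors below show):
# clone the piper source and install its Python package into the container
git clone --depth 1 https://github.com/rhasspy/piper.git /usr/src/piper
cd /usr/src/piper/src/python
pip3 install --upgrade pip setuptools wheel
pip3 install -e . 'pytorch-lightning'
# build the monotonic_align extension that piper_train imports
./build_monotonic_align.sh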
re baremetal: I tried, and got errors, so I was hoping doing it inside a container would resolve that. re Dockerfile: Yes, that is one of the issues with the documentation I think. I tried modifying the Dockerfile as follows but I'm clearly missing something:
FROM nvcr.io/nvidia/pytorch:22.03-py3
# Install system dependencies
RUN apt-get update && apt-get install -y python3-dev \
&& rm -rf /var/lib/apt/lists/*
# Clone the repository
RUN git clone --depth 1 https://github.com/rhasspy/piper.git /piper
# Set the working directory
WORKDIR /piper/src/python
# Create and activate virtual environment
RUN python3 -m venv .venv \
&& . .venv/bin/activate \
&& pip install --upgrade pip setuptools wheel \
&& pip install -e . 'pytorch-lightning'
# Deactivate the virtual environment
RUN deactivate
# Set environment variable
ENV NUMBA_CACHE_DIR=.numba_cache
$ docker build . -t cloning
...
11.19 ERROR: No matching distribution found for piper-phonemize~=1.1.0
------
Dockerfile:14
--------------------
13 | # Create and activate virtual environment
14 | >>> RUN python3 -m venv .venv \
15 | >>> && . .venv/bin/activate \
16 | >>> && pip install --upgrade pip setuptools wheel \
17 | >>> && pip install -e . 'pytorch-lightning'
18 |
--------------------
ERROR: failed to solve: process "/bin/sh -c python3 -m venv .venv && . .venv/bin/activate && pip install --upgrade pip setuptools wheel && pip install -e . 'pytorch-lightning'" did not complete successfully: exit code: 1
Well the TRAINING.md Dockerfile example starts with:
FROM nvcr.io/nvidia/pytorch:22.03-py3
And if you run that container, and run python --version, you'll see it's python 3.8, which I know from experience can't build the piper modules. I've settled on using Python 3.10 since 3.9, 3.11 and 3.12 all didn't work for me.
This error specifically: 11.19 ERROR: No matching distribution found for piper-phonemize~=1.1.0
Seems to happen with python 3.8
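You can reproduce that failure without a full image build by asking pip for the wheel directly in the stock image (just a sketch using pip's download command):
# with the image's default Python 3.8 this should fail with the same
# "No matching distribution found for piper-phonemize~=1.1.0" error seen above
docker run --rm nvcr.io/nvidia/pytorch:22.03-py3 \
  pip3 download 'piper-phonemize~=1.1.0' --no-deps -d /tmp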
I've created a new Dockerfile which installs Python 3.10. It's still building, but once it does, I can verify that piper_train runs properly with GPU support.
Ok, that worked after rebuilding the container with python 3.10. Specifically I added the following in the Dockerfile before installing piper or creating the venv:
RUN mkdir -pv /usr/src/python
WORKDIR /usr/src/python
RUN wget https://www.python.org/ftp/python/3.10.13/Python-3.10.13.tgz
RUN tar zxvf Python-3.10.13.tgz
RUN apt-get update && apt install -y libffi-dev && rm -rf /var/lib/apt/lists/*
WORKDIR /usr/src/python/Python-3.10.13
RUN ./configure --enable-optimizations
RUN make -j8
RUN make altinstall
WORKDIR /usr/src/piper/src/python
RUN /usr/local/bin/python3.10 -m venv .venv
RUN source .venv/bin/activate && pip list && pip install pip wheel setuptools -U && pip list && pip install -r requirements.txt && pip list && pip install -e . && pip list && pip install torchmetrics==0.11.4 && pip install piper-tts && ./build_monotonic_align.sh && pip3 install piper-tts
Earlier in the Dockerfile I install some other apt packages so that Python compiles; those include:
espeak-ng git build-essential zlib1g-dev libbz2-dev liblzma-dev libncurses5-dev libreadline6-dev libsqlite3-dev libssl-dev libgdbm-dev liblzma-dev tk-dev lzma lzma-dev libgdbm-dev
Some of those should be removed after python is compiled, to reduce the size of the container.
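If you do want to slim the image down, something along these lines at the very end of the Dockerfile (after everything that needs a compiler has been built) is one way to do it. This is only a sketch, with the package names taken from the list above; whether every one of them is safe to purge in this image is an assumption:
# drop the build-only packages and their orphaned dependencies
RUN apt-get purge -y build-essential zlib1g-dev libbz2-dev liblzma-dev \
        libncurses5-dev libreadline6-dev libsqlite3-dev libssl-dev \
        libgdbm-dev tk-dev lzma-dev \
    && apt-get autoremove -y \
    && apt-get clean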
After the above modifications, the container sees the GPU and piper_train loads correctly. I didn't do any training since the GPU is already busy.
Any chance you can push that container to Docker Hub?
I'm stuck here:
ignacio@xps:~/piper-checkpoints$ docker run -ti cloning bash
=============
== PyTorch ==
=============
NVIDIA Release 22.03 (build 33569136)
PyTorch Version 1.12.0a0+2c916ef
Container image Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copyright (c) 2014-2022 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006 Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015 Google Inc.
Copyright (c) 2015 Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see
https://docs.nvidia.com/datacenter/cloud-native/ .
NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for PyTorch. NVIDIA recommends the use of the following flags:
docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...
root@fa766fb67ab7:/usr/src/piper/src/python# /usr/local/bin/python3.10 -m venv .venv
root@fa766fb67ab7:/usr/src/piper/src/python# source .venv/bin/activate && pip list && pip install pip wheel setuptools -U && pip list && pip install -r requirements.txt && pip list && pip install -e . && pip list && pip install torchmetrics==0.11.4 && pip install piper-tts && ./build_monotonic_align.sh && pip3 install piper-tts
Package Version
---------- -------
pip 23.0.1
setuptools 65.5.0
[notice] A new release of pip is available: 23.0.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in ./.venv/lib/python3.10/site-packages (23.0.1)
Collecting pip
Downloading pip-23.3.1-py3-none-any.whl (2.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 1.8 MB/s eta 0:00:00
Collecting wheel
Downloading wheel-0.42.0-py3-none-any.whl (65 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.4/65.4 kB 5.0 MB/s eta 0:00:00
Requirement already satisfied: setuptools in ./.venv/lib/python3.10/site-packages (65.5.0)
Collecting setuptools
Downloading setuptools-69.0.2-py3-none-any.whl (819 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 819.5/819.5 kB 2.9 MB/s eta 0:00:00
Installing collected packages: wheel, setuptools, pip
Attempting uninstall: setuptools
Found existing installation: setuptools 65.5.0
Uninstalling setuptools-65.5.0:
Successfully uninstalled setuptools-65.5.0
Attempting uninstall: pip
Found existing installation: pip 23.0.1
Uninstalling pip-23.0.1:
Successfully uninstalled pip-23.0.1
Successfully installed pip-23.3.1 setuptools-69.0.2 wheel-0.42.0
Package Version
---------- -------
pip 23.3.1
setuptools 69.0.2
wheel 0.42.0
ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'
And this is what I currently have for the Dockerfile:
FROM nvcr.io/nvidia/pytorch:22.03-py3
# Set environment variables
ENV NUMBA_CACHE_DIR=.numba_cache
ENV TZ=America/Los_Angeles
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
# Install system dependencies
RUN apt-get update \
&& apt-get install -y \
espeak-ng \
git \
build-essential \
zlib1g-dev \
libbz2-dev \
liblzma-dev \
libncurses5-dev \
libreadline6-dev \
libsqlite3-dev \
libssl-dev \
libgdbm-dev \
liblzma-dev \
lzma \
lzma-dev \
libgdbm-dev
RUN DEBIAN_FRONTEND=noninteractive TZ=$TZ apt-get -y install tzdata \
&& DEBIAN_FRONTEND=noninteractive TZ=$TZ apt-get -y install tzdata \
&& rm -rf /var/lib/apt/lists/*
# Install Python
ARG PYTHON_VERSION=3.10.13
RUN mkdir -pv /usr/src/python \
&& cd /usr/src/python \
&& wget https://www.python.org/ftp/python/${PYTHON_VERSION}/Python-${PYTHON_VERSION}.tgz \
&& tar zxvf Python-${PYTHON_VERSION}.tgz \
&& cd Python-${PYTHON_VERSION} \
&& ./configure --enable-optimizations \
&& make -j8 \
&& make altinstall \
&& cd / \
&& rm -rf /usr/src/python
# Set working directory
WORKDIR /usr/src/piper/src/python
I don't see a git clone of the piper GitHub repo in your Dockerfile.
I didn't see that in the code you shared, nor can I figure out how to incorporate it into the Dockerfile I'm trying to put together. This is clearly above my head. I will keep my fingers crossed for a public container, Dockerfile, or improved documentation. Thanks for the help.
FROM nvcr.io/nvidia/pytorch:22.03-py3
RUN pip3 install 'pytorch-lightning'
ENV NUMBA_CACHE_DIR=.numba_cache
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt install -y python3-dev python3-venv espeak-ng git build-essential zlib1g-dev libbz2-dev liblzma-dev libncurses5-dev libreadline6-dev libsqlite3-dev libssl-dev libgdbm-dev liblzma-dev tk-dev lzma lzma-dev libgdbm-dev libffi-dev && rm -rf /var/lib/apt/lists/*
RUN mkdir -pv /usr/src/
WORKDIR /usr/src/
RUN git clone https://github.com/rhasspy/piper.git
RUN mkdir -pv /usr/src/python
WORKDIR /usr/src/python
RUN wget https://www.python.org/ftp/python/3.10.13/Python-3.10.13.tgz
RUN tar zxvf Python-3.10.13.tgz
WORKDIR /usr/src/python/Python-3.10.13
RUN ./configure --enable-optimizations
RUN make -j8
RUN make altinstall
WORKDIR /usr/src/piper/src/python
RUN /usr/local/bin/python3.10 -m venv .venv
RUN source .venv/bin/activate && pip list && pip install pip wheel setuptools -U && pip list && pip install -r requirements.txt && pip list && pip install -e . && pip list && pip install torchmetrics==0.11.4 && pip install piper-tts && ./build_monotonic_align.sh && pip3 install piper-tts
RUN ln -fs /usr/share/zoneinfo/America/Chicago /etc/localtime
I used to think Docker was amazing for people like me; this experience has changed my mind:
178.3 pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.
------
Dockerfile:19
--------------------
17 | WORKDIR /usr/src/piper/src/python
18 | RUN /usr/local/bin/python3.10 -m venv .venv
19 | >>> RUN source .venv/bin/activate && pip list && pip install pip wheel setuptools -U && pip list && pip install -r requirements.txt && pip list && pip install -e . && pip list && pip install torchmetrics==0.11.4 && pip install piper-tts && ./build_monotonic_align.sh && pip3 install piper-tts
20 | RUN ln -fs /usr/share/zoneinfo/America/Chicago /etc/localtime
21 |
--------------------
ERROR: failed to solve: process "/bin/sh -c source .venv/bin/activate && pip list && pip install pip wheel setuptools -U && pip list && pip install -r requirements.txt && pip list && pip install -e . && pip list && pip install torchmetrics==0.11.4 && pip install piper-tts && ./build_monotonic_align.sh && pip3 install piper-tts" did not complete successfully: exit code: 2
This seems like a transient issue. They happen. Did you try running the container build again?
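If it keeps happening, one thing that sometimes helps is making pip more tolerant of a slow connection in that long RUN line, e.g. (a sketch; the exact numbers are arbitrary):
# raise pip's socket timeout and retry count for flaky downloads
pip install --timeout 120 --retries 10 -r requirements.txt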
Tried to build it one more time, without changing anything, and it worked. Alas:
root@3f2a8ced13e2:/usr/src/piper/src/python# python3 -m piper_train \
> --dataset-dir /train/ \
> --accelerator 'gpu' \
> --devices 1 \
> --batch-size 32 \
> --validation-split 0.0 \
> --num-test-examples 0 \
> --max_epochs 10000 \
> --resume_from_checkpoint /piper-checkpoints/epoch=4641-step=3104302.ckpt \
> --checkpoint-epochs 1 \
> --precision 32
/opt/conda/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: /opt/conda/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
warn(f"Failed to load image Python extension: {e}")
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/usr/src/piper/src/python/piper_train/__main__.py", line 10, in <module>
from .vits.lightning import VitsModel
File "/usr/src/piper/src/python/piper_train/vits/lightning.py", line 15, in <module>
from .models import MultiPeriodDiscriminator, SynthesizerTrn
File "/usr/src/piper/src/python/piper_train/vits/models.py", line 10, in <module>
from . import attentions, commons, modules, monotonic_align
File "/usr/src/piper/src/python/piper_train/vits/monotonic_align/__init__.py", line 4, in <module>
from .monotonic_align.core import maximum_path_c
ModuleNotFoundError: No module named 'piper_train.vits.monotonic_align.monotonic_align.core'
root@3f2a8ced13e2:/usr/src/piper/src/python#
While in the container, before running "python3 -m piper_train", try running
source .venv/bin/activate
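Equivalently, you can skip the activation step and call the venv's interpreter directly, for example:
/usr/src/piper/src/python/.venv/bin/python3 -m piper_train --dataset-dir /train/ ...
since the piper dependencies were only installed into that virtual environment, not into the container's default Python.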
Thanks. Getting closer I think:
(.venv) root@3f2a8ced13e2:/usr/src/piper/src/python# python3 -m piper_train \
> --dataset-dir /train/ \
> --accelerator 'gpu' \
> --devices 1 \
> --batch-size 32 \
> --validation-split 0.0 \
> --num-test-examples 0 \
> --max_epochs 10000 \
> --resume_from_checkpoint /piper-checkpoints/epoch=4641-step=3104302.ckpt \
> --checkpoint-epochs 1 \
> --precision 32
DEBUG:piper_train:Namespace(dataset_dir='/train/', checkpoint_epochs=1, quality='medium', resume_from_single_speaker_checkpoint=None, logger=True, enable_checkpointing=True, default_root_dir=None, gradient_clip_val=None, gradient_clip_algorithm=None, num_nodes=1, num_processes=None, devices='1', gpus=None, auto_select_gpus=False, tpu_cores=None, ipus=None, enable_progress_bar=True, overfit_batches=0.0, track_grad_norm=-1, check_val_every_n_epoch=1, fast_dev_run=False, accumulate_grad_batches=None, max_epochs=10000, min_epochs=None, max_steps=-1, min_steps=None, max_time=None, limit_train_batches=None, limit_val_batches=None, limit_test_batches=None, limit_predict_batches=None, val_check_interval=None, log_every_n_steps=50, accelerator='gpu', strategy=None, sync_batchnorm=False, precision=32, enable_model_summary=True, weights_save_path=None, num_sanity_val_steps=2, resume_from_checkpoint='/piper-checkpoints/epoch=4641-step=3104302.ckpt', profiler=None, benchmark=None, deterministic=None, reload_dataloaders_every_n_epochs=0, auto_lr_find=False, replace_sampler_ddp=True, detect_anomaly=False, auto_scale_batch_size=False, plugins=None, amp_backend='native', amp_level=None, move_metrics_to_cpu=False, multiple_trainloader_mode='max_size_cycle', batch_size=32, validation_split=0.0, num_test_examples=0, max_phoneme_ids=None, hidden_channels=192, inter_channels=192, filter_channels=768, n_layers=6, n_heads=2, seed=1234)
/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py:52: LightningDeprecationWarning: Setting `Trainer(resume_from_checkpoint=)` is deprecated in v1.5 and will be removed in v1.7. Please pass `Trainer.fit(ckpt_path=)` directly instead.
rank_zero_deprecation(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
DEBUG:piper_train:Checkpoints will be saved every 1 epoch(s)
DEBUG:vits.dataset:Loading dataset: /train/dataset.jsonl
/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:731: LightningDeprecationWarning: `trainer.resume_from_checkpoint` is deprecated in v1.5 and will be removed in v2.0. Specify the fit checkpoint path with `trainer.fit(ckpt_path=)` instead.
ckpt_path = ckpt_path or self.resume_from_checkpoint
Missing logger folder: /train/lightning_logs
Restoring states from the checkpoint path at /piper-checkpoints/epoch=4641-step=3104302.ckpt
DEBUG:fsspec.local:open file: /piper-checkpoints/epoch=4641-step=3104302.ckpt
/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:345: UserWarning: The dirpath has changed from '/home/hansenm/larynx2/local/en_US/ryan/medium/lightning_logs/version_0/checkpoints' to '/train/lightning_logs/version_0/checkpoints', therefore `best_model_score`, `kth_best_model_path`, `kth_value`, `last_model_path` and `best_k_models` won't be reloaded. Only `best_model_path` will be reloaded.
warnings.warn(
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
DEBUG:fsspec.local:open file: /train/lightning_logs/version_0/hparams.yaml
Restored all states from the checkpoint file at /piper-checkpoints/epoch=4641-step=3104302.ckpt
/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/utilities/data.py:153: UserWarning: Total length of `DataLoader` across ranks is zero. Please make sure this was your intention.
rank_zero_warn(
/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:236: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 12 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:1892: PossibleUserWarning: The number of training batches (2) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
rank_zero_warn(
Traceback (most recent call last):
File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/src/piper/src/python/piper_train/__main__.py", line 147, in <module>
main()
File "/usr/src/piper/src/python/piper_train/__main__.py", line 124, in main
trainer.fit(model)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
results = self._run_stage()
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1252, in _run_stage
return self._run_train()
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1283, in _run_train
self.fit_loop.run()
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 271, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 174, in advance
batch = next(data_fetcher)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/utilities/fetching.py", line 184, in __next__
return self.fetching_function()
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/utilities/fetching.py", line 263, in fetching_function
self._fetch_next_batch(self.dataloader_iter)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/utilities/fetching.py", line 277, in _fetch_next_batch
batch = next(iterator)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/supporters.py", line 557, in __next__
return self.request_next_batch(self.loader_iters)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/supporters.py", line 569, in request_next_batch
return apply_to_collection(loader_iters, Iterator, next)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/utilities/apply_func.py", line 99, in apply_to_collection
return function(data, *args, **kwargs)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
data = self._next_data()
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
return self._process_data(data)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
data.reraise()
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/_utils.py", line 543, in reraise
raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/utils/data/dataset.py", line 295, in __getitem__
return self.dataset[self.indices[idx]]
File "/usr/src/piper/src/python/piper_train/vits/dataset.py", line 80, in __getitem__
audio_norm=torch.load(utt.audio_norm_path),
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/serialization.py", line 771, in load
with _open_file_like(f, 'rb') as opened_file:
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/serialization.py", line 270, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/torch/serialization.py", line 251, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/ignacio/out-train/cache/22050/be0533dcc16fb185a824504edb199d5862d70203c43ec4f0c45e3fa133a9b857.pt'
I don't understand the error, given that I'm running that command inside the container, and this is what I run to get into the container:
ignacio@xps:~/piper-checkpoints$ docker run --runtime=nvidia --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -ti -v /home/ignacio/out-train:/train -v /home/ignacio/piper-checkpoints:/piper-checkpoints cloning bash
ignacio@xps:~/piper-checkpoints$ ls /home/ignacio/out-train
cache config.json dataset.jsonl lightning_logs
ignacio@xps:~/piper-checkpoints$ ls /home/ignacio/out-train/cache/
22050
In case it is relevant:
ignacio@xps:~/piper-checkpoints$ tree /home/ignacio/out-train
locales-launch: Data of en_US locale not found, generating, please wait...
/home/ignacio/out-train
├── cache
│ └── 22050
│ ├── 04e01d02ea9d846ce5227caa79557650cd174774711a5a6aa97e0ca367aadcb5.pt
│ ├── 04e01d02ea9d846ce5227caa79557650cd174774711a5a6aa97e0ca367aadcb5.spec.pt
│ ├── 06cd4ef3b72e9632fa844f1ec24a1426f49e8f674b4cce04a08e5b2807bc552b.pt
│ ├── 06cd4ef3b72e9632fa844f1ec24a1426f49e8f674b4cce04a08e5b2807bc552b.spec.pt
│ ├── 0948075669e629c7d9d6fbeea0e78df1f4b51de2787171fdf9d7f151e051a0ba.pt
│ ├── 0948075669e629c7d9d6fbeea0e78df1f4b51de2787171fdf9d7f151e051a0ba.spec.pt
│ ├── 0df1b17ffb19d9ef36be002010eaeae27096c730e7ed2595d91d15b110861354.pt
│ ├── 0df1b17ffb19d9ef36be002010eaeae27096c730e7ed2595d91d15b110861354.spec.pt
│ ├── 11a4355e2d69348d330a8226a7af766b36ecf8f3d68efd7722fa22330405cac3.pt
│ ├── 11a4355e2d69348d330a8226a7af766b36ecf8f3d68efd7722fa22330405cac3.spec.pt
│ ├── 1a9b151680072e8f1c412c7542d67e3579ddb735260a4f38b4418a0636261af1.pt
│ ├── 1a9b151680072e8f1c412c7542d67e3579ddb735260a4f38b4418a0636261af1.spec.pt
│ ├── 2024851fc407962a29a90e876e1bd73e2434c507505c3b40bead0f5c3b5f2dac.pt
│ ├── 2024851fc407962a29a90e876e1bd73e2434c507505c3b40bead0f5c3b5f2dac.spec.pt
│ ├── 2cb99059aef28b0c877203e021a85ae1d2cd55588e2b8bd1a5c98706ac6c17ac.pt
│ ├── 2cb99059aef28b0c877203e021a85ae1d2cd55588e2b8bd1a5c98706ac6c17ac.spec.pt
│ ├── 303708369840cf8172919b6e0586a5538a5a20dc1480fe756e82b56c74ef96d4.pt
│ ├── 303708369840cf8172919b6e0586a5538a5a20dc1480fe756e82b56c74ef96d4.spec.pt
│ ├── 30fd0aa0d27f70e709471161c9cf3cc95f7d4a16d146b8c13c4b4a409fbc13b5.pt
│ ├── 30fd0aa0d27f70e709471161c9cf3cc95f7d4a16d146b8c13c4b4a409fbc13b5.spec.pt
│ ├── 360c548aadf51aad829e007e40615917f51494bf1716218cf5291a3a0da84736.pt
│ ├── 360c548aadf51aad829e007e40615917f51494bf1716218cf5291a3a0da84736.spec.pt
│ ├── 3f57866c2d2e0640411413403a9a9bdbbd15bdfc0ccf6d7ffbcd67c0f4ee6ee7.pt
│ ├── 3f57866c2d2e0640411413403a9a9bdbbd15bdfc0ccf6d7ffbcd67c0f4ee6ee7.spec.pt
│ ├── 49c2832bef4fc31133f4d38888645d24908a3f6a1a58079f803e80b012397dd7.pt
│ ├── 49c2832bef4fc31133f4d38888645d24908a3f6a1a58079f803e80b012397dd7.spec.pt
│ ├── 4e6ba182a31950b9917c19bde3aec13036c7ffcf54f67fbdf5694596ceeede1c.pt
│ ├── 4e6ba182a31950b9917c19bde3aec13036c7ffcf54f67fbdf5694596ceeede1c.spec.pt
│ ├── 5427257a73627162aff7d19e6d112a98c30a39ba2e663e0e975f35b53189e06e.pt
│ ├── 5427257a73627162aff7d19e6d112a98c30a39ba2e663e0e975f35b53189e06e.spec.pt
│ ├── 548272f6eb8d4a860a03b0d10c98de7a19bc0ff806b88efd4f3c256c0444357c.pt
│ ├── 548272f6eb8d4a860a03b0d10c98de7a19bc0ff806b88efd4f3c256c0444357c.spec.pt
│ ├── 57b4b77b8004ce6c31edba7991b3a20b804d6c4e02cccacbc80723c122f3d9d9.pt
│ ├── 57b4b77b8004ce6c31edba7991b3a20b804d6c4e02cccacbc80723c122f3d9d9.spec.pt
│ ├── 598a03603d426674dcf3e68b323c99517ccda4b7240bc064f5610dabf0f7e66c.pt
│ ├── 598a03603d426674dcf3e68b323c99517ccda4b7240bc064f5610dabf0f7e66c.spec.pt
│ ├── 5a3c38fdfcb352a6c1e780983278f8b30bedb9eade064118bb69a7136924c636.pt
│ ├── 5a3c38fdfcb352a6c1e780983278f8b30bedb9eade064118bb69a7136924c636.spec.pt
│ ├── 5c8fc290cd50260d9b1cade5b85256ae7d2897fd672163f4b0dfa4da5210d73f.pt
│ ├── 5c8fc290cd50260d9b1cade5b85256ae7d2897fd672163f4b0dfa4da5210d73f.spec.pt
│ ├── 5d901cbf53cee1434c263d70324654676fbf4bd821f4279592685b7b48634008.pt
│ ├── 5d901cbf53cee1434c263d70324654676fbf4bd821f4279592685b7b48634008.spec.pt
│ ├── 66b04792e1fbde0e55b161e5056bb93d7a148684bf66bf304ec5ff16f5ad25da.pt
│ ├── 66b04792e1fbde0e55b161e5056bb93d7a148684bf66bf304ec5ff16f5ad25da.spec.pt
│ ├── 6ec3592ca3a7e08caef2351aa53f2e796f921d870a1417d1edf73b1b45722b6b.pt
│ ├── 6ec3592ca3a7e08caef2351aa53f2e796f921d870a1417d1edf73b1b45722b6b.spec.pt
│ ├── 7179e033dd036c6957e93fbfe51841b2f8018088ada9185875505cab004a2b85.pt
│ ├── 7179e033dd036c6957e93fbfe51841b2f8018088ada9185875505cab004a2b85.spec.pt
│ ├── 7219bd4dcf0d8b7c481f1dfc38b3396ad0b7e8a7fd73a0019599f04d474d7517.pt
│ ├── 7219bd4dcf0d8b7c481f1dfc38b3396ad0b7e8a7fd73a0019599f04d474d7517.spec.pt
│ ├── 788360cedf32e6b73f98274883e853446315060699fb5a9e38e43cd881137bcd.pt
│ ├── 788360cedf32e6b73f98274883e853446315060699fb5a9e38e43cd881137bcd.spec.pt
│ ├── 8144dc825ab98c874ef587abc42b6b713b4d0974365274da61230df0f73a8fd8.pt
│ ├── 8144dc825ab98c874ef587abc42b6b713b4d0974365274da61230df0f73a8fd8.spec.pt
│ ├── 8e6fcd06ff034ad9a5ab4727db696ae214609f1cdb8f3d70901b02870db87fba.pt
│ ├── 8e6fcd06ff034ad9a5ab4727db696ae214609f1cdb8f3d70901b02870db87fba.spec.pt
│ ├── 9faad2f2db9feda31a0cd610efc8ba71a8c71576fc8036b74682968956967233.pt
│ ├── 9faad2f2db9feda31a0cd610efc8ba71a8c71576fc8036b74682968956967233.spec.pt
│ ├── a3e1c79bdf950cba0c5a8158b1af9f0e738baa7a563290c3f9111cb6acfbf8c6.pt
│ ├── a3e1c79bdf950cba0c5a8158b1af9f0e738baa7a563290c3f9111cb6acfbf8c6.spec.pt
│ ├── a3f27635e5976f2f7040683193089c4ba7811a787f606caf8ea10d696482a4a1.pt
│ ├── a3f27635e5976f2f7040683193089c4ba7811a787f606caf8ea10d696482a4a1.spec.pt
│ ├── b69d4a066baec4f5390e9d621a0245202fc5090adf5c5c549885fd605928e589.pt
│ ├── b69d4a066baec4f5390e9d621a0245202fc5090adf5c5c549885fd605928e589.spec.pt
│ ├── be0533dcc16fb185a824504edb199d5862d70203c43ec4f0c45e3fa133a9b857.pt
│ ├── be0533dcc16fb185a824504edb199d5862d70203c43ec4f0c45e3fa133a9b857.spec.pt
│ ├── d461b6e9cbaf7964ff6bb12f01fa3d95ef32574cca435878f7e8b54b8552fcc5.pt
│ ├── d461b6e9cbaf7964ff6bb12f01fa3d95ef32574cca435878f7e8b54b8552fcc5.spec.pt
│ ├── d4690794bbded50e17fc257a6767d9e9f13ec24e9175f8cb1c94625030618f6a.pt
│ ├── d4690794bbded50e17fc257a6767d9e9f13ec24e9175f8cb1c94625030618f6a.spec.pt
│ ├── d4dcfbfc322144297f7491c1ad8f7a7b8261b0838ec1cd895e86dbd1187c21ed.pt
│ ├── d4dcfbfc322144297f7491c1ad8f7a7b8261b0838ec1cd895e86dbd1187c21ed.spec.pt
│ ├── d88b05d05baa089db447036092abeb80fdd4e1bb404744132deb9cd13c67bcdb.pt
│ ├── d88b05d05baa089db447036092abeb80fdd4e1bb404744132deb9cd13c67bcdb.spec.pt
│ ├── d89e07bfc4ae6ae5fca8ab3a965dad2bedeec11874a6d93c7b7f64cab3821671.pt
│ ├── d89e07bfc4ae6ae5fca8ab3a965dad2bedeec11874a6d93c7b7f64cab3821671.spec.pt
│ ├── e5de78e7e3ae1848f7f9c2a88981b4e0ebb89d33c392d73b1d5adcd87d53e44f.pt
│ ├── e5de78e7e3ae1848f7f9c2a88981b4e0ebb89d33c392d73b1d5adcd87d53e44f.spec.pt
│ ├── eb75989247b6048dda0c22839e1624a154e326c1f05b65fe2c7b46401ee6f2c9.pt
│ ├── eb75989247b6048dda0c22839e1624a154e326c1f05b65fe2c7b46401ee6f2c9.spec.pt
│ ├── ef8b43e347e846da70a9cf0b3d904fbc7075cd79e44adf2ccecf8b7be20b3530.pt
│ ├── ef8b43e347e846da70a9cf0b3d904fbc7075cd79e44adf2ccecf8b7be20b3530.spec.pt
│ ├── f2507ea125f5da92ad38333853f8ee7e0339cce79d9a3062adca8533a6477031.pt
│ ├── f2507ea125f5da92ad38333853f8ee7e0339cce79d9a3062adca8533a6477031.spec.pt
│ ├── f3a9dbeced79da786bf4642e54243a90774afb1df4a75cde7855f445f8466409.pt
│ ├── f3a9dbeced79da786bf4642e54243a90774afb1df4a75cde7855f445f8466409.spec.pt
│ ├── f4c618f3269e352f45480d0efc7a6d1a40fff479b566d5fa7c2de0634379423e.pt
│ ├── f4c618f3269e352f45480d0efc7a6d1a40fff479b566d5fa7c2de0634379423e.spec.pt
│ ├── fdefa487bf76ff204c9a1a3e8ac8137ab56f87c12dca4a8f8b4e570751609470.pt
│ └── fdefa487bf76ff204c9a1a3e8ac8137ab56f87c12dca4a8f8b4e570751609470.spec.pt
├── config.json
├── dataset.jsonl
└── lightning_logs [error opening dir]
I'm guessing your dataset.jsonl has the paths from your local machine, like /home/ignacio. Have a look in it and see. Either pre-process your wav files IN the container, or use the same paths in the container, or maybe do a search and replace in the dataset.jsonl.
I think it's easier to just use the exact same paths in the container.
So:
-v /home/ignacio/piper-checkpoints:/piper-checkpoints
becomes:
-v /home/ignacio/piper-checkpoints:/home/ignacio/piper-checkpoints
and the same for other -v options too.
You're almost there. This should work.
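Alternatively, if you'd rather keep the short /train mount, a search-and-replace on dataset.jsonl should also work. A sketch, assuming every cached path in it starts with /home/ignacio/out-train:
# rewrite the host paths recorded during preprocessing to the in-container mount point
sed -i 's|/home/ignacio/out-train|/train|g' /train/dataset.jsonl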
Thank you so much. In addition to the Dockerfile above, these are the steps I followed (in case someone else is having the same troubles I did):
docker run --runtime=nvidia --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -ti -v /home/ignacio/ignacio-test:/ignacio-test -v /home/ignacio/out-train:/train -v /home/ignacio/piper-checkpoints:/piper-checkpoints cloning bash
source .venv/bin/activate
python3 -m piper_train.preprocess \
--language en-us \
--input-dir /ignacio-test/ \
--output-dir /train/ \
--dataset-format ljspeech \
--single-speaker \
--sample-rate 22050
python3 -m piper_train \
--dataset-dir /train/ \
--accelerator 'gpu' \
--devices 1 \
--batch-size 10 \
--validation-split 0.0 \
--num-test-examples 0 \
--max_epochs 10000 \
--resume_from_checkpoint /piper-checkpoints/epoch=4641-step=3104302.ckpt \
--checkpoint-epochs 1 \
--precision 32
Glad you were able to get it sorted. Hopefully anyone trying to train with a docker container will find this thread. Not sure if you can change the title of the thread, but that would help them find it.
I think I got to the point of exporting the model:
DEBUG:fsspec.local:open file: /train/lightning_logs/version_3/checkpoints/epoch=9999-step=3157882.ckpt
`Trainer.fit` stopped: `max_epochs=10000` reached.
Where do I find model.ckpt and model.onnx? I tried running this:
(.venv) root@7806dca0c20f:/usr/src/piper/src/python# python3 -m piper_train.export_onnx \
> /ignacio-test/model.ckpt \
> /ignacio-test/model.onnx
Traceback (most recent call last):
File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/src/piper/src/python/piper_train/export_onnx.py", line 109, in <module>
main()
File "/usr/src/piper/src/python/piper_train/export_onnx.py", line 42, in main
model = VitsModel.load_from_checkpoint(args.checkpoint, dataset=None)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
return _load_from_checkpoint(
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/core/saving.py", line 184, in _load_from_checkpoint
checkpoint = pl_load(checkpoint_path, map_location=map_location)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/pytorch_lightning/utilities/cloud_io.py", line 46, in load
with fs.open(path_or_url, "rb") as f:
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/fsspec/spec.py", line 1295, in open
f = self._open(
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/fsspec/implementations/local.py", line 180, in _open
return LocalFileOpener(path, mode, fs=self, **kwargs)
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/fsspec/implementations/local.py", line 302, in __init__
self._open()
File "/usr/src/piper/src/python/.venv/lib/python3.10/site-packages/fsspec/implementations/local.py", line 307, in _open
self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/ignacio-test/model.ckpt'
(.venv) root@7806dca0c20f:/usr/src/piper/src/python#
You probably need something like:
python3 -m piper_train.export_onnx "/train/lightning_logs/version_3/checkpoints/epoch=9999-step=3157882.ckpt" /ignacio-test/model.onnx
Don't forget to copy the json file too.
cp /train/config.json /ignacio-test/model.onnx.json
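If you want a quick sanity check of the exported voice before pointing anything else at it, the piper command-line tool can synthesize from the onnx directly (a sketch, assuming piper-tts is installed and the .onnx.json sits next to the model):
# speak a test sentence with the freshly exported voice
echo 'Testing my new voice.' | piper --model /ignacio-test/model.onnx --output_file /ignacio-test/test.wav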
Got the files and copied them to piper:
Alas, even after restarting piper and Home Assistant I cannot see my new voice.
Any suggestions?
Can't help there. Don't know anything at all about Home Assistant, never used it. Hopefully someone who has will chime in. Maybe a post to the Home Assistant forums will help you get the custom voice added. My guess is there's voices in a config file somewhere or some scan operation needs to be performed.
Ignoring Home Assistant, to get it to work inside piper do you only need to copy the files next to the other ones? That is:
When I look at the log of my piper container it seems like piper is not aware that the file is there:
INFO:__main__:Ready
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-9' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.9/dist-packages/wyoming/server.py:28> exception=VoiceNotFoundError('ignacio')>
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/wyoming/server.py", line 35, in run
if not (await self.handle_event(event)):
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/handler.py", line 73, in handle_event
piper_proc = await self.process_manager.get_process(voice_name=voice_name)
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/process.py", line 114, in get_process
ensure_voice_exists(
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/download.py", line 77, in ensure_voice_exists
find_voice(name, data_dirs)
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/download.py", line 183, in find_voice
raise VoiceNotFoundError(name)
wyoming_piper.download.VoiceNotFoundError: ignacio
INFO:wyoming_piper.download:Downloaded /data/en_US-hfc_male-medium.onnx.json (https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/hfc_male/medium/en_US-hfc_male-medium.onnx.json)
INFO:wyoming_piper.download:Downloaded /data/en_US-hfc_male-medium.onnx (https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/hfc_male/medium/en_US-hfc_male-medium.onnx)
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-23' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.9/dist-packages/wyoming/server.py:28> exception=VoiceNotFoundError('ignacio')>
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/wyoming/server.py", line 35, in run
if not (await self.handle_event(event)):
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/handler.py", line 73, in handle_event
piper_proc = await self.process_manager.get_process(voice_name=voice_name)
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/process.py", line 114, in get_process
ensure_voice_exists(
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/download.py", line 77, in ensure_voice_exists
find_voice(name, data_dirs)
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/download.py", line 183, in find_voice
raise VoiceNotFoundError(name)
wyoming_piper.download.VoiceNotFoundError: ignacio
INFO:__main__:Ready
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-6' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.9/dist-packages/wyoming/server.py:28> exception=VoiceNotFoundError('ignacio')>
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/wyoming/server.py", line 35, in run
if not (await self.handle_event(event)):
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/handler.py", line 73, in handle_event
piper_proc = await self.process_manager.get_process(voice_name=voice_name)
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/process.py", line 114, in get_process
ensure_voice_exists(
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/download.py", line 77, in ensure_voice_exists
find_voice(name, data_dirs)
File "/usr/local/lib/python3.9/dist-packages/wyoming_piper/download.py", line 183, in find_voice
raise VoiceNotFoundError(name)
wyoming_piper.download.VoiceNotFoundError: ignacio
But if I get inside the piper container the files are clearly there:
$ ls -la
total 1055100
drwxrwxrwx 1 1026 users 1436 Dec 15 17:53 .
drwxr-xr-x 1 root root 4096 Dec 15 17:27 ..
-rwxrwxrwx 1 1026 users 63104526 Nov 24 16:38 en_GB-alan-low.onnx
-rwxrwxrwx 1 1026 users 4170 Nov 24 16:38 en_GB-alan-low.onnx.json
-rwxrwxrwx 1 1026 users 63201294 Nov 24 16:39 en_GB-alan-medium.onnx
-rwxrwxrwx 1 1026 users 4888 Nov 24 16:39 en_GB-alan-medium.onnx.json
-rwxrwxrwx 1 1026 users 76952753 Nov 24 16:38 en_GB-vctk-medium.onnx
-rwxrwxrwx 1 1026 users 6637 Nov 24 16:38 en_GB-vctk-medium.onnx.json
-rwxrwxrwx 1 1026 users 63104526 Nov 20 22:16 en_US-amy-low.onnx
-rwxrwxrwx 1 1026 users 4164 Nov 20 22:16 en_US-amy-low.onnx.json
-rwxrwxrwx 1 1026 users 63201294 Dec 15 17:53 en_US-hfc_male-medium.onnx
-rwxrwxrwx 1 1026 users 5033 Dec 15 17:53 en_US-hfc_male-medium.onnx.json
-rwxrwxrwx 1 1026 users 63511038 Dec 15 15:53 en_US-ignacio-low.onnx
-rwxrwxrwx 1 1026 users 7082 Dec 15 17:49 en_US-ignacio-low.onnx.json
-rwxrwxrwx 1 1026 users 63104526 Nov 22 06:27 en_US-kathleen-low.onnx
-rwxrwxrwx 1 1026 users 4169 Nov 22 06:27 en_US-kathleen-low.onnx.json
-rwxrwxrwx 1 1026 users 113895201 Nov 20 22:15 en_US-lessac-high.onnx
-rwxrwxrwx 1 1026 users 4883 Nov 20 22:15 en_US-lessac-high.onnx.json
-rwxrwxrwx 1 1026 users 63201294 Nov 24 16:40 en_US-lessac-low.onnx
-rwxrwxrwx 1 1026 users 4882 Nov 24 16:40 en_US-lessac-low.onnx.json
-rwxrwxrwx 1 1026 users 136673811 Nov 24 16:40 en_US-libritts-high.onnx
-rwxrwxrwx 1 1026 users 20163 Nov 24 16:40 en_US-libritts-high.onnx.json
-rwxrwxrwx 1 1026 users 120786792 Nov 24 16:40 en_US-ryan-high.onnx
-rwxrwxrwx 1 1026 users 4166 Nov 24 16:40 en_US-ryan-high.onnx.json
-rwxrwxrwx 1 1026 users 63104526 Nov 20 22:15 en_US-ryan-low.onnx
-rwxrwxrwx 1 1026 users 4165 Nov 20 22:15 en_US-ryan-low.onnx.json
-rwxrwxrwx 1 1026 users 63201294 Nov 20 19:34 en_US-ryan-medium.onnx
-rwxrwxrwx 1 1026 users 4883 Nov 20 19:34 en_US-ryan-medium.onnx.json
-rwxrwxrwx 1 1026 users 63201294 Nov 21 16:49 es_MX-ald-medium.onnx
-rwxrwxrwx 1 1026 users 4889 Nov 21 16:51 es_MX-ald-medium.onnx.json
-rwxrwxrwx 1 1026 users 4889 Nov 21 16:49 es_es_MX_ald_medium_es_MX-ald-medium.onnx.json
I'm trying to follow https://github.com/rhasspy/piper/blob/master/TRAINING.md and I got to the step of getting into the container you recommend. However, it is not clear from the documentation whether I have to install additional things, or whether I have to go to some particular directory. This is what I did:
I'm probably missing something obvious, but it would be great if that obvious step was documented. Thanks!