@blackmamba1122 Looks like some of the cached phonemes are corrupted. You need to delete the cache directory or change the phoneme cache directory (the "phoneme_cache_path" parameter in the config), forcing TTS to recompute it.
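For example, pointing the cache at a fresh directory in a Python-side config could look roughly like this (a minimal sketch only; the config class and the new path are placeholders for your actual setup):

```python
# Sketch: point phoneme_cache_path at a new, empty directory so the phoneme
# cache is rebuilt from scratch on the next run. Tacotron2Config and the path
# below are placeholders, not taken from the reporter's actual config.
from TTS.tts.configs.tacotron2_config import Tacotron2Config

config = Tacotron2Config(
    use_phonemes=True,
    phoneme_cache_path="/media/DATA-2/TTS/phoneme_cache_new",  # fresh directory
)
```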
I can suggest 2 fixes that you might try:
- Move the cache folder temporarily to a different location and let it rebuild.
- Add `allow_pickle=True` to the `np.load(cache_path)` call, i.e. `np.load(cache_path, allow_pickle=True)`, at /media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py, line 579 (sketched below).
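In code, the second suggestion amounts to something like the following. This is a sketch only: `load_phoneme_cache` is a hypothetical helper used for illustration, not the library's actual method, and the real call sits around line 579 of dataset.py in this checkout.

```python
import numpy as np

def load_phoneme_cache(cache_path):
    # Hypothetical wrapper illustrating the suggested change: let np.load
    # unpickle the cached phoneme data instead of raising
    # "Cannot load file containing pickled data when allow_pickle=False".
    return np.load(cache_path, allow_pickle=True)
```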
I've tried fix number 2, but I got an error like this:
--> STEP: 1209/5840 -- GLOBAL_STEP: 1691210
| > decoder_loss: 0.56349 (0.80207)
| > postnet_loss: 0.49640 (0.72210)
| > stopnet_loss: 0.85311 (0.30274)
| > decoder_coarse_loss: 0.88406 (1.23891)
| > decoder_ddc_loss: 0.00170 (0.00865)
| > ga_loss: 0.00004 (0.00041)
| > decoder_diff_spec_loss: 0.36727 (0.39819)
| > postnet_diff_spec_loss: 0.32913 (0.35275)
| > decoder_ssim_loss: 0.12858 (0.25808)
| > postnet_ssim_loss: 0.11792 (0.23859)
| > loss: 1.57546 (1.30963)
| > align_error: 0.60392 (0.42270)
| > grad_norm: 1.56272 (3.97671)
| > current_lr: 0.00000
| > step_time: 2.24570 (1.13703)
| > loader_time: 0.00220 (0.00179)
! Run is kept in /media/DATA-2/TTS/TTS_Coqui/TTS-July-28-2022_09+54AM-68cef28a
Traceback (most recent call last):
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1492, in fit
self._fit()
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1476, in _fit
self.train_epoch()
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1254, in train_epoch
for cur_step, batch in enumerate(self.train_loader):
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
data = self._next_data()
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1224, in _next_data
return self._process_data(data)
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
data.reraise()
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/_utils.py", line 457, in reraise
raise exception
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 180, in __getitem__
return self.load_data(idx)
File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 230, in load_data
token_ids = self.get_token_ids(idx, item["text"])
File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 213, in get_token_ids
token_ids = self.get_phonemes(idx, text)["token_ids"]
File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 198, in get_phonemes
assert len(out_dict["token_ids"]) > 0
AssertionError
> @blackmamba1122 Looks like some of the cached phonemes are corrupted. You need to delete the cache directory or change the phoneme cache directory (the "phoneme_cache_path" parameter in the config), forcing TTS to recompute it.

I'll try this one and report back.
Still getting the error:
--> STEP: 1209/5840 -- GLOBAL_STEP: 1691210
| > decoder_loss: 0.58440 (0.80216)
| > postnet_loss: 0.51266 (0.72178)
| > stopnet_loss: 0.84992 (0.29996)
| > decoder_coarse_loss: 0.89247 (1.24166)
| > decoder_ddc_loss: 0.00162 (0.00863)
| > ga_loss: 0.00004 (0.00034)
| > decoder_diff_spec_loss: 0.37415 (0.39893)
| > postnet_diff_spec_loss: 0.33291 (0.35312)
| > decoder_ssim_loss: 0.12760 (0.25786)
| > postnet_ssim_loss: 0.11682 (0.23833)
| > loss: 1.58579 (1.30728)
| > align_error: 0.60102 (0.41433)
| > grad_norm: 4.16512 (4.11554)
| > current_lr: 0.00000
| > step_time: 2.44260 (1.16248)
| > loader_time: 0.00260 (0.00190)
! Run is kept in /media/DATA-2/TTS/TTS_Coqui/TTS-July-28-2022_09+54AM-68cef28a
Traceback (most recent call last):
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1492, in fit
self._fit()
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1476, in _fit
self.train_epoch()
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1254, in train_epoch
for cur_step, batch in enumerate(self.train_loader):
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
data = self._next_data()
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
return self._process_data(data)
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
data.reraise()
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/_utils.py", line 457, in reraise
raise exception
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 180, in __getitem__
return self.load_data(idx)
File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 230, in load_data
token_ids = self.get_token_ids(idx, item["text"])
File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 213, in get_token_ids
token_ids = self.get_phonemes(idx, text)["token_ids"]
File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 198, in get_phonemes
assert len(out_dict["token_ids"]) > 0
AssertionError
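One way to narrow this down is to walk the dataset outside the DataLoader and record which samples trip the assertion. This is a sketch only: `find_empty_phoneme_samples` is a hypothetical helper, and `dataset` stands for the TTSDataset instance built for training.

```python
def find_empty_phoneme_samples(dataset):
    # Sketch: indexing the dataset goes through __getitem__ -> load_data ->
    # get_token_ids -> get_phonemes, so samples whose token_ids come back
    # empty raise the same AssertionError seen in the traceback above.
    bad_indices = []
    for idx in range(len(dataset)):
        try:
            dataset[idx]
        except AssertionError:
            bad_indices.append(idx)
    return bad_indices
```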
I'm done with this problem. If you meet this problem, you can do it this way:
Hi there,
I have to test it further, but in a multi-GPU setting some of the workers fail with the error `ValueError: Cannot load file containing pickled data when allow_pickle=False`. I don't know exactly why, but passing the argument `allow_pickle=True` in the `np.load` of the `compute_or_load` method of the `PhonemeDataset` class seems to fix the issue.
I think this may be because `np.save` allows pickle by default, while the load function doesn't. I'm not sure why this is problematic only in the multi-GPU setting.
I'll post updates here, but I'd propose passing the argument `allow_pickle=True` to the load function, since the phoneme cache is created by the library and there's not a big security risk. What do you think?
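The save/load asymmetry described above is easy to reproduce with NumPy alone (a standalone sketch, independent of the TTS code; the file name is arbitrary):

```python
import numpy as np

# np.save pickles non-array objects (e.g. a dict) by default ...
np.save("/tmp/phoneme_cache_demo.npy", {"token_ids": [1, 2, 3]})

# ... but np.load refuses to unpickle unless explicitly allowed.
try:
    np.load("/tmp/phoneme_cache_demo.npy")
except ValueError as err:
    print(err)  # Cannot load file containing pickled data when allow_pickle=False

# Passing allow_pickle=True mirrors the proposed fix.
data = np.load("/tmp/phoneme_cache_demo.npy", allow_pickle=True).item()
print(data["token_ids"])  # [1, 2, 3]
```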
Describe the bug
I had been training Tacotron 2 for a while and now I want to add sample audio for one speaker. When I run using
I got an error like this:
Environment