Open deryaguler95 opened 1 year ago
It seems that your training dataset includes digits, "1" specifically.
The best fix is to convert every number in the dataset from digits to words (e.g. "1" -> "one"). This is tedious, but it should give the best inference results.
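A rough sketch of that conversion, in case it helps. It assumes English text and whole numbers only; `number_to_words` and `expand_numbers` are names made up for this example and are not part of the repo. Words are joined with spaces rather than hyphens because "-" is not in the symbol list below. For real data, a library such as `inflect` or `num2words` is likely a better choice.

```python
import re

_ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
         "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999999 in English (hypothetical helper)."""
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return _TENS[tens] + ("" if rest == 0 else " " + _ONES[rest])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return _ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = "" if rest == 0 else " " + number_to_words(rest)
    return number_to_words(thousands) + " thousand" + tail


def expand_numbers(text: str) -> str:
    """Replace every run of ASCII digits in `text` with its spelled-out form."""
    def _replace(match):
        digits = match.group(0)
        # Very long digit runs (IDs, phone numbers) are read digit by digit.
        if len(digits) > 6:
            return " ".join(_ONES[int(d)] for d in digits)
        return number_to_words(int(digits))

    return re.sub(r"[0-9]+", _replace, text)


print(expand_numbers("I have 1 apple and 21 pears."))
```

Running this over the transcript column of the filelists before training would leave no digits for the symbol lookup to fail on.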
Alternatively, you can add the digits to the text/symbols.py file, something like this:
""" from https://github.com/keithito/tacotron """
'''
Defines the set of symbols used in text input to the model.
'''
_pad = '_'  # '_' as in the upstream keithito/tacotron symbols file
_punctuation = ';:,.!?¡¿—…"«»“” '
_letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
_letters_ipa = "ɑɐɒæɓʙβɔɕçɗɖðʤəɘɚɛɜɝɞɟʄɡɠɢʛɦɧħɥʜɨɪʝɭɬɫɮʟɱɯɰŋɳɲɴøɵɸθœɶʘɹɺɾɻʀʁɽʂʃʈʧʉʊʋⱱʌɣɤʍχʎʏʑʐʒʔʡʕʢǀǁǂǃˈˌːˑʼʴʰʱʲʷˠˤ˞↓↑→↗↘'̩'ᵻ"
_numbers = "1234567890"
symbols = [_pad] + list(_punctuation) + list(_letters) + list(_letters_ipa) + list(_numbers)
SPACE_ID = symbols.index(" ")
I got the error below.
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 1 terminated with the following error:
raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
sequence = [_symbol_to_id[symbol] for symbol in cleaned_text]
KeyError: '1'
I could not fix it. Does anyone have any ideas?