rhasspy / piper

A fast, local neural text to speech system
https://rhasspy.github.io/piper-samples/
MIT License

Problem preprocessing a Norwegian dataset #564

Open Rem56445 opened 1 month ago

Rem56445 commented 1 month ago

Hello people. Whenever I try to preprocess a Norwegian dataset, I get the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 79: invalid continuation byte
I assume this is because of the Norwegian characters, but how do I fix it?

agonzalezd commented 1 week ago

Your data is probably not encoded in UTF-8. Try converting the text files to UTF-8 before preprocessing.
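
A minimal sketch of one way to do that conversion in Python. It assumes the files were saved as Latin-1 (ISO-8859-1), where byte 0xe6 is the letter "æ"; that matches the error above but is only a guess, so check the result by eye. The helper name convert_to_utf8 and the file name used below are made up for illustration and are not part of Piper.

```python
from pathlib import Path


def convert_to_utf8(path, source_encoding="latin-1"):
    """Re-encode one text file to UTF-8 in place.

    Returns True if the file was rewritten, False if it was already valid UTF-8.
    source_encoding is an assumption about how the file was originally saved.
    """
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, leave the file untouched
    except UnicodeDecodeError:
        pass
    # Decode with the assumed original encoding, then write back as UTF-8.
    # Bytes are written directly so line endings are not altered.
    text = raw.decode(source_encoding)
    Path(path).write_bytes(text.encode("utf-8"))
    return True
```

For example, convert_to_utf8("metadata.csv") would rewrite a transcript file with that (hypothetical) name. Keep a backup first: Latin-1 decoding never raises an error, so if the real encoding is something else, the characters will silently come out wrong.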