Open Schokodrache opened 2 years ago
I receive the following error when using the language "German":

Type: FileNotFoundError
Text: [Errno 2] No such file or directory: 'alphabets\German.txt'
Full: Traceback (most recent call last):
  File "flask\app.py", line 1950, in full_dispatch_request
  File "flask\app.py", line 1936, in dispatch_request
  File "application\views.py", line 121, in create_dataset_post
  File "application\views.py", line 87, in get_symbols
  File "training\utils.py", line 189, in loadsymbols
FileNotFoundError: [Errno 2] No such file or directory: 'alphabets\German.txt'
I created this directory with the German.txt file from the source code and used the "combine clips" option. Building the dataset then reports:

Matched 458 segments
Combining clips
Produced 4 final clips
Here's the audio file: https://drive.google.com/file/d/1FIc8vdIBoE8aXneKrouYGJwnGj94qS_S/view?usp=sharing
Original text file: Haushaltsgesetz_2022.txt
Text file in the dataset folder after processing (the umlauts are missing): text.txt
Thanks a lot for looking into this problem!
I just ran a test, and on my end I get:

Size: 0 hours, 31 minutes
Total clips: 255

Maybe something went wrong during setup? Normally it should not be necessary to add the German language manually.
@Schokodrache As for the error with the disappearing umlauts: please convert the file to UTF-8 encoding and try again.
@BenAAndrew Maybe this should be added as a hint somewhere. If a source text contains special characters, the file needs to be UTF-8 encoded; otherwise the characters will simply disappear.
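For anyone who wants to script the conversion suggested above, here is a minimal sketch. It assumes the original file was saved in Windows-1252 (`cp1252`), which is common for text files created on Windows but is only a guess; the function name and the default encoding are illustrative and not part of the app.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-encode a text file as UTF-8.

    source_encoding is an assumption: adjust it if the file was saved
    with a different encoding (for example "latin-1").
    """
    # Decode the bytes with the encoding the file was originally saved in.
    text = Path(src).read_text(encoding=source_encoding)
    # Write the same characters back out as UTF-8.
    Path(dst).write_text(text, encoding="utf-8")
    return text


# Example call (file names taken from this thread; the output name is hypothetical):
# convert_to_utf8("Haushaltsgesetz_2022.txt", "Haushaltsgesetz_2022_utf8.txt")
```

If the umlauts look wrong in the converted file, the source encoding guess was probably incorrect and a different value should be tried.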
@SirBitesalot Good idea. Please open a PR on the relevant page
This definitely goes in the right direction, thanks! The text.txt file now includes the umlauts. Yet still only a few segments are identified; see the metadata.csv file. Very strange.

metadata.csv

What could go wrong during setup? The alphabets folder is present in a folder called User/Appdata/Local/temp/_MEI198762. This folder seems to be created when the exe is executed, but I ran the exe from another folder, and alphabets\German.txt was still not found.
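A possible explanation, offered as an assumption rather than a confirmed diagnosis: `_MEI...` temp folders are what PyInstaller one-file executables unpack into, and a relative path such as `alphabets\German.txt` is resolved against the current working directory, not against that temp folder. A common pattern for locating bundled files is sketched below. The helper name `resource_path` is hypothetical, and it is not confirmed that the app resolves its paths this way.

```python
import os
import sys


def resource_path(relative_path):
    """Return an absolute path to a bundled resource.

    Inside a PyInstaller bundle, sys._MEIPASS points at the folder the
    files were unpacked into; otherwise fall back to the current working
    directory (an assumption about where the app is run from source).
    """
    base_dir = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base_dir, relative_path)


# Example: build the path to the German alphabet file from this thread.
german_alphabet = resource_path(os.path.join("alphabets", "German.txt"))
```

With this pattern the file is found inside the unpacked bundle regardless of which folder the exe is started from.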
Meanwhile, I've built the app myself, and now everything works fine for the German language. So the original problem seems somehow to be related to the ready-to-use build. Thanks again for the support!
Did you modify the German alphabet file? The packaged app uses this version so it wouldn't have the changes you made to the source code
No, I haven't changed anything in the alphabet file. Building the app from the source code solved all problems.
I got the same problem when trying to use German. The files are there, and even replacing them with the current ones doesn't help.
Voice Cloning works flawlessly using the English language – really powerful tool, thanks a lot!
But there are several bugs for the German language.
- German.txt must be put into the alphabets subfolder manually; otherwise an error message appears while building the dataset.
- Matching the segments does not work properly, neither for the built-in version of the German language nor for a newly created language version with coqui files. Only two minutes of text are identified from a 30-minute source file (source quality is good).
- German.txt contains umlauts and special characters, but these are missing / cut out in the final metadata.csv.
The automatic segmenting is one of the most helpful features of your app, so it would be extremely helpful if it also worked for non-English languages. Thanks!
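To check the missing-umlaut symptom described above, one quick diagnostic is to compare the characters in the original transcript with those in the processed text. This is only a sketch: the function name is hypothetical, and it merely reports which characters vanished, not why.

```python
def dropped_characters(original, processed):
    """Return the characters that occur in original but never in processed."""
    return sorted(set(original) - set(processed))


# Example with a short German phrase whose umlauts were stripped.
missing = dropped_characters("Häuser für alle", "Huser fr alle")
print(missing)  # ['ä', 'ü']
```

Running this on the source text file and on the text column of metadata.csv (both read as UTF-8) would show whether umlauts and ß are the only characters being lost.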