Tetragramm closed this issue 4 months ago
Hi @Tetragramm
Thanks for the diagnostics and details. Can I ask: when you see the failures, do the generations include sentences that are only 2-3 characters long? e.g.:
[AllTalk TTSGen] Character (Text-not-inside) [AllTalk TTSGen] She
These should be dropped from the generation and not added to the catalogue of WAV files. However, it's possible there is a hidden/non-visible character in there that we can't see and that's causing something odd to happen.
Text generations of 3 or fewer characters should be automatically removed, not generated, and so not added to the list of WAV files to be combined. The specific error you are experiencing occurs when it combines the WAV files into one file:
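To check for a hidden character, the raw text can be dumped one character at a time. The following is a minimal sketch, not part of AllTalk: `describe_chars` and the sample string are hypothetical. It also shows that a zero-width space is not treated as whitespace by `str.strip()`, so a string that looks like `She` can measure 4 characters and slip past a length check of 3.

```python
import unicodedata

def describe_chars(text):
    """Return (repr, code point, Unicode name) for every character in text."""
    return [
        (repr(ch), f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]

# Hypothetical sample: "She" followed by a zero-width space (U+200B).
sample = "She\u200b"

# strip() only removes whitespace; U+200B is a format character, so it stays.
print(len(sample.strip()))

for row in describe_chars(sample):
    print(row)
```

Whether this is what happens in the failing generations is only a guess; the point is that printing code points makes such characters visible.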
UnboundLocalError: cannot access local variable 'sample_rate' where it is not associated with a value
The combine step reads all the generated WAV files, confirms they are valid files with the same sample rate, and then combines them. What seems to be happening here (as best I can tell) is that one file either doesn't exist on disk or didn't return a sample rate value. I'm slightly baffled as to why, but I can suggest something for you to try; let me know if it resolves your issue.
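In general, Python raises this kind of `UnboundLocalError` when a local variable is assigned only inside a loop or branch that never ran, and is then read afterwards. The sketch below is not AllTalk's actual combine code; it is an illustration using the standard-library `wave` module, with a hypothetical `combine_wavs` function. It skips missing or unreadable files and fails with a clear message if no file supplied a sample rate.

```python
import wave

def combine_wavs(paths, out_path):
    """Concatenate WAV files that share channels, sample width and sample rate."""
    params = None  # stays None until one file has been read successfully
    frames = []
    for path in paths:
        try:
            with wave.open(str(path), "rb") as wav:
                current = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
                data = wav.readframes(wav.getnframes())
        except (FileNotFoundError, wave.Error, EOFError) as exc:
            # Missing, corrupt or empty file: report it and carry on.
            print(f"Skipping {path}: {exc}")
            continue
        if params is None:
            params = current
        elif current != params:
            raise ValueError(f"{path} has parameters {current}, expected {params}")
        frames.append(data)

    # Explicit guard instead of reading a variable that was never assigned.
    if params is None:
        raise ValueError("No readable WAV files to combine")

    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(params[0])
        out.setsampwidth(params[1])
        out.setframerate(params[2])
        for data in frames:
            out.writeframes(data)
    return params[2]  # the shared sample rate
```

Initialising `params` before the loop and checking it afterwards turns a confusing crash into an error that names the real problem.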
Line 676 of script.py reads:
if len(part.strip()) <= 3:
Try changing the 3 to a 1 and see if that resolves the problem you are experiencing.
Without seeing the original text in its raw form, it's hard for me to break down exactly why you have that short She as a generation and not a full sentence. If it works, I can always add this as an advanced variable that can be changed in the interface.
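As a rough illustration of what such a variable could look like, here is a minimal sketch with the threshold passed in as a parameter. The function `filter_short_parts` and the parameter `min_chars` are hypothetical names, not taken from script.py; only the `len(part.strip()) <= 3` comparison comes from the line quoted above.

```python
def filter_short_parts(parts, min_chars=3):
    """Drop parts whose stripped length is min_chars or fewer."""
    return [part for part in parts if len(part.strip()) > min_chars]

# Hypothetical sample of text parts split from one reply.
parts = ["She", "She walked to the door.", "  Oh "]

print(filter_short_parts(parts))               # default threshold of 3
print(filter_short_parts(parts, min_chars=1))  # relaxed threshold of 1
```

With the default of 3, both short parts are dropped; with 1, all three are kept and sent on for generation.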
Let me know
Thanks
I don't believe they are all that short. I wasn't paying much attention to the failed generation's text though. I will check the next time I see the failure.
No problem. It could be any of the individual sentences in the generation, so if it's combining, let's say, 6 sentences together, it doesn't have to be the last sentence of the 6.
That aside, I'm puzzled as to what else it could be, though maybe a corrupt WAV file was generated. Let me know.
Thanks
Hi @Tetragramm
Not sure if you resolved this or not. I'm going to add various additional controls in the next version of AllTalk I upload, so this setting will be part of that.
If you need to get back to me, please do so.
Thanks
🔴 If you have installed AllTalk in a custom Python environment, I will only be able to provide limited assistance/support. AllTalk draws on a variety of scripts and libraries that are not written or managed by myself, and they may fail, error or give strange results in custom built python environments.
🔴 Please generate a diagnostics report and upload the "diagnostics.log" as this helps me understand your configuration.
https://github.com/erew123/alltalk_tts/tree/main?#-how-to-make-a-diagnostics-report-file
Describe the bug Using text-generation-webui. As the context length creeps up, the TTS step fails more often. For this particular conversation, it is failing every time, despite regenerating text. Often I can regenerate and get the same broken text, but the TTS gen includes the <audio src=... of the previous TTS (of the broken text), and then a second regenerate succeeds.
To Reproduce Steps to reproduce the behaviour: Not consistent, happens pretty randomly as I generate text.
Screenshots If applicable, add screenshots to help explain your problem.
Text/logs
Desktop (please complete the following information):
AllTalk was updated: 2024/04/17
Custom Python environment: no
Text-generation-webUI was updated: 2024/04/17
Additional context diagnostics.log