aedocw / epub2tts

Turn an epub or text file into an audiobook
Apache License 2.0
528 stars, 50 forks

Error: ❗ XTTS can only generate text with a maximum of 400 tokens. ... Retrying (1 retries left) #201

Closed Nikanoru closed 8 months ago

Nikanoru commented 8 months ago

Hey! Sorry, it is me again^^... :P I currently keep getting this error every time, at the exact same spot. I know I have TTS'ed this entire book before with a non-XTTS voice and it worked fine, but now it no longer finishes. I'm not sure whether that is because of XTTS or something else.

image

aedocw commented 8 months ago

I think I know what I have to change to address this. In the meantime, are you able to test with a different branch? If so, please try:

cd epub2tts
git fetch --all
git checkout 193-max-tokens
python epub2tts.py <whatever parameters you previously used>

I'm not sure that will fix it, but there is at least a chance this will. Let me know if you're able to do that.

Nikanoru commented 8 months ago

Sure I will give it a try! Thanks for the suggestion. Will report back soon.

Nikanoru commented 8 months ago

It seems to be working :) Thank you! image

aedocw commented 8 months ago

Excellent, thanks, this is helpful. I have been looking through the parts where I break things up into sentences and there's a lot of room for improvement. At least now I know this helps!

One other thing you can do to speed things up is disable the transcript comparison (it looks like it is frequently getting a low ratio and trying again). You can do this by adding --minratio 0.
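
For readers wondering why a low ratio leads to retries, here is a minimal sketch of a transcript check. It is an illustration only: this thread does not show how epub2tts actually compares the transcript with the source text, so the `difflib`-based `similarity` function, the `needs_retry` helper, and the 0-to-1 scale for `min_ratio` are all assumptions, not the tool's real code.

```python
from difflib import SequenceMatcher


def similarity(source_text: str, transcript: str) -> float:
    """Return a 0..1 similarity score, ignoring case and extra whitespace."""
    a = " ".join(source_text.lower().split())
    b = " ".join(transcript.lower().split())
    return SequenceMatcher(None, a, b).ratio()


def needs_retry(source_text: str, transcript: str, min_ratio: float) -> bool:
    """Hypothetical check: retry when the transcript differs too much.

    With min_ratio set to 0 the score can never fall below the threshold,
    so nothing is ever retried.
    """
    return similarity(source_text, transcript) < min_ratio


if __name__ == "__main__":
    spoken = "see h t t p s colon slash slash example dot com"
    written = "See https://example.com/a?b=1"
    print(similarity(written, spoken))
    print(needs_retry(written, spoken, 0))
```

Under these assumptions, text full of links scores poorly because the spoken form of a URL looks nothing like the written one, which would explain repeated retries in a "Notes" section.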

Nikanoru commented 8 months ago

Thank you for the tip with the --minratio 0. I thought about asking if there is a solution for this, but wasn't sure if I should just add it to my question here or make a new thread for it. Issue solved hehe

Also, I only seem to be getting this "doesn't sound right, retrying" thing for the "Notes" section of the book, which has a ton of links in it. The friend I am making this for wanted those included in the audiobook, otherwise I would have just left them out. Good thing he wanted to keep them in, otherwise I would not have encountered the 400 token issue, and it sounded like knowing that the solution you suggested works helped you (and me) as well.

Nikanoru commented 8 months ago

Oh no! It started breaking again. Still the same chapter as before, but much further along. image

aedocw commented 8 months ago

OK thanks for the update. I think I know how to tackle this but it might be a day or two before I get to it. I'll update this issue as soon as I've got something you can test with.

aedocw commented 8 months ago

Could you do one more test for me, and add --debug to your epub2tts command? This will make it very chatty, but should show the exact sentence that is making it choke. It should be less than 225 words... I'm trying to figure out if one of the steps that tries to break things into sentences is failing and accidentally making a really long one that causes this error.
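
A minimal sketch of that check, for anyone who wants to look for overlong sentences in their own text before running TTS. This is not the splitting code epub2tts uses; the regex-based `split_sentences`, the `overlong_sentences` and `chunk_sentence` helpers, and the file name in the usage note are hypothetical. Only the 225-word figure comes from the comment above.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def overlong_sentences(text: str, max_words: int = 225) -> list[str]:
    """Return the sentences that contain more than max_words words."""
    return [s for s in split_sentences(text) if len(s.split()) > max_words]


def chunk_sentence(sentence: str, max_words: int = 225) -> list[str]:
    """Break one sentence into pieces of at most max_words words each."""
    words = sentence.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    sample = "Short one. " + " ".join(["word"] * 300) + ". End!"
    for sentence in overlong_sentences(sample):
        print(len(sentence.split()), "words ->",
              len(chunk_sentence(sentence)), "chunks")
```

To check a real book, read the exported text file (for example `open("mybook.txt", encoding="utf-8").read()`) and pass it to `overlong_sentences`. Note that a word count treats a long URL as a single word, so this check alone will not flag a sentence that is short in words but long in tokens.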

Nikanoru commented 8 months ago

--debug

Here is the result:

image

You know, when this issue initially popped up, I listened to the tempwav just before the error, to find the text part which causes problems and I found the text, but my eyes could not see anything unusual about it.

aedocw commented 8 months ago

Ah-ha! It's that URL that is throwing it for a loop! One MORE test: could you try adding --skiplinks and see if that catches and skips reading that URL (and any others)?

aedocw commented 8 months ago

To expand on this a little bit: I added --skiplinks for my own use in my earliest version because I did not want to hear links read out, which were usually in the footnotes. I can imagine there may be some stories where a URL is important, but I cannot recall ever hearing one read out in a professionally made audiobook.

One thing you could do for situations like this is to manually replace the URL with text that you would like to hear. The easiest way to do this is to export the book to text, make your modifications, and then create the audiobook from the text. That will not help you in this case where you've already produced a large portion of the audiobook, unless you're happy starting over.

Also, it looks like this may be a footnote? There is also a --skipfootnotes option, but that doesn't always work well, since it seems like every book indicates where footnotes are differently. Personally, I've been exporting books to text and removing things like footnotes at the end of chapters.

The process would be:

epub2tts mybook.epub --export txt
edit mybook.txt, and note chapter breaks look like "# Chapter 1"
epub2tts mybook.txt

Nikanoru commented 8 months ago

--skiplinks

--skiplinks doesn't help. I had already tried it yesterday, but I tried it again now and I still get the same error:

image

I was surprised that --skiplinks does not seem to change anything. I am already using the .txt file, so you are 100% correct that I could just edit out that link. If I were making the audiobook for myself, I would have thrown out that entire part^^, but the friend it is for seems to want the links in the audiobook.

Here is how that section looks in the .txt file

image

I have not tried the --skipfootnotes option yet iirc. Will try that! Thank you :)

Edit: --skipfootnotes also no success

aedocw commented 8 months ago

OK --skipfootnotes really depends on the formatting of the book (for instance if it's an epub and there is a chapter that starts with the word "Footnotes").

Can you try in the text on that exact line replacing the URL with "long URL redacted" (or something like that)? Short of doing that, I'm not sure what else to do. That might be the only URL that is just long enough to confuse TTS :)

Nikanoru commented 8 months ago

Hehe yeah! I just asked my friend if it is ok for him if I leave out this URL and explained to him why. I should probably have done this from the start and saved you a bunch of time. Thank you very much for the help and assistance :)

That is a good idea with the "long URL Redacted". I think I will call it "long URL Redacted #1" and add a readme along with the audio book which has a numbered list for all the redacted urls (if there will even be more than 1).
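
A minimal sketch of that numbering idea, in case it saves some manual editing. It replaces each URL in the text with "long URL redacted #N" and returns the list of original URLs for the readme. The regex, the `redact_urls` function, and the `min_length` parameter are hypothetical helpers for illustration, not part of epub2tts.

```python
import re

# Assumption: a URL is anything starting with http:// or https:// up to
# the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def redact_urls(text: str, min_length: int = 0) -> tuple[str, list[str]]:
    """Replace URLs with numbered placeholders.

    Returns the edited text and the original URLs in order, so that
    placeholder #N corresponds to urls[N - 1]. URLs shorter than
    min_length characters are left untouched.
    """
    urls: list[str] = []

    def replace(match: re.Match) -> str:
        url = match.group(0)
        trailing = ""
        # Keep sentence punctuation that directly follows the URL.
        while url and url[-1] in ".,;:!?)":
            trailing = url[-1] + trailing
            url = url[:-1]
        if len(url) < min_length:
            return match.group(0)
        urls.append(url)
        return f"long URL redacted #{len(urls)}{trailing}"

    return URL_PATTERN.sub(replace, text), urls


if __name__ == "__main__":
    text = "See https://example.com/a/b?c=1. Also http://x.org, ok."
    redacted, found = redact_urls(text)
    print(redacted)
    for number, url in enumerate(found, start=1):
        print(f"{number}. {url}")
```

Setting `min_length` to a larger value would leave short links to be read aloud and redact only the long ones.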

Edit: Also I wanted to say, the new landing page with the updated explanations/guides is really great! I noticed one typo in the "Usage" -> "All options" section where it says Damien BLack, I think the L got capitalized by accident, not sure if that could cause an issue or not but it stood out to me :)

aedocw commented 8 months ago

Thanks for pointing out the typo on the README, it's fixed. Also glad you like the updated page, feels like a big improvement :)

I'm going to close this bug as I think we found the issue. It HAS prompted me to write some good instructions for how to report bugs, to make the first round of troubleshooting easier and faster when folks inevitably find more issues down the road.

As always, thanks for using this and reaching out for help when you find problems. For sure it will help other people who might be reluctant to file a bug, so it really makes a difference - thank you!