pgmichael / wavenet-for-chrome

Chrome extension that transforms highlighted text into high-quality natural sounding audio using Google Cloud's Text-to-Speech.
http://wavenet-for-chrome.com
MIT License

Seemingly random skipping of sentences when reading over 5000 characters. #41

Closed superluig164 closed 3 years ago

superluig164 commented 3 years ago

I've noticed that sometimes portions of sentences, and sometimes entire sentences, get completely skipped when reading >5000 characters. My guess is that it has something to do with the regex that handles splitting the text, but I can't be sure of that. Either way, it doesn't happen with every body of text. I think it happens more in bodies with odd characters sprinkled in, but I've also had the last sentence just get skipped with no explanation before. The easiest way to test would be to just select all the text on various sites and see if it reads everything.

pgmichael commented 3 years ago

I haven't stumbled on this issue, although I haven't been using the extension a lot as of late.

If the issue was regex related, it would be replicable by re-synthesizing the same text. This leads me to believe it might be a network error.

How frequently does this issue seem to occur?

superluig164 commented 3 years ago

I should have tried that, now I'm kicking myself. It happens kinda randomly, often enough to notice, but not often enough to be that annoying. I can just read along and fill in the sentence by reading it. I don't notice a delay or anything between the sentences, so I imagine if it was a network error it would just take a long time to try to download the audio, and when it finally failed, THEN move on to the next sentence, right?

That being said, my internet is not bad. I'm supposed to have 100Mb up/dn, although on average I get about 40-50 dn and 90 up. Still, that's not bad internet, and I've not noticed any similar issues when reading smaller passages, and I also have my phone with Tasker reading out my notifications using WaveNet almost constantly and it's never missed anything either.

pgmichael commented 3 years ago

No worries! I'm actually not too sure how it will behave if one of the sentences fails while reading.

That being said, I've added Sentry to the extension, which should help me spot any widespread issues if there are any.

I'll add an update here if I manage to spot or fix the issue, but don't hesitate to let me know if you ever find a way to reproduce it.

superluig164 commented 3 years ago

Okay, I think I may have found a repeatable scenario.

This is just a random comment from YouTube. Disregard its contents. Starts here:

For anyone who is interested what exactly those bottles in the first aid kit First bottle is iodine solution, used for cuts/burns disinfection Second bottle is valerian tincture has a very calming effect, useful when driver/passenger is in shock Third bottle is boric acid solution, antiseptic useful in eye injuries Forth bottle is iodine solution again. There is never too much iodine solution

Pills that Doug handles is charcoal I also see validol (hearth problem), valerian, analgin (US analog is tylenol) and besalol (no idea what it is, instruction says it is for stomach pain) in the instruction list on top on the aid kit

I am not a doctor

And ends here.

If I copy from the r in doctor to the F in For (that is, everything) and read it, it will read until the word "again." on the fifth line. After that... it's all skipped. I have tested it on YouTube, in a pastebin, and in the editing box of this video, and it consistently skips that last section of the text.
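For illustration, here is a minimal sketch of one way a splitting regex could produce exactly this symptom. The pattern and the name `naiveSplit` are hypothetical and not taken from the extension's source. The pattern only matches runs of text that end in `.`, `!` or `?`, so anything after the last terminator never matches and is dropped without an error.

```typescript
// Hypothetical sentence splitter, NOT the extension's actual regex.
// It only matches runs of text that end in '.', '!' or '?'.
function naiveSplit(text: string): string[] {
  return text.match(/[^.!?]+[.!?]+/g) || [];
}

// Shortened stand-in for the YouTube comment above: one terminated
// sentence followed by text with no closing punctuation.
const reproText: string =
  "Forth bottle is iodine solution again. There is never too much iodine solution";

const matched: string[] = naiveSplit(reproText);
const droppedChars: number = reproText.length - matched.join("").length;

console.log(matched);      // the pieces that would be sent for synthesis
console.log(droppedChars); // characters that never reach the synthesizer
```

Because the result depends only on the input text, a splitter like this would skip the same passage every time the same text is read, which matches the repeatable behaviour described here.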

I hope this is helpful.

@pgmichael

superluig164 commented 3 years ago

To confirm (I'm not editing the post because I don't want it to potentially interrupt anything): I just tested reading it from the post itself, using the keyboard shortcut, and it again repeated the same behaviour. It also occurred to me that it seems to not be unique to reading >5000 characters. I suppose just reading a large passage increases the likelihood of it happening.

pgmichael commented 3 years ago

Thanks for giving me a replicable scenario, @superluig164.

Looks like the regex was at fault here and I've swapped it with something that should work more reliably.
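The replacement itself isn't shown in this thread. As a rough sketch only, one splitting approach that cannot drop text could look like the following. The names `chunkText` and `MAX_CHARS` are hypothetical, and the 5000 limit is taken from the issue title and treated as a character count, which is an assumption.

```typescript
// Hypothetical splitter, NOT the fix that was actually shipped.
// Limit taken from the issue title; assumed to be a character count.
const MAX_CHARS = 5000;

function chunkText(text: string, maxChars: number = MAX_CHARS): string[] {
  // First alternative: optional text followed by terminators.
  // Second alternative: trailing text with no terminator at all.
  // Together the matches cover every character of the input.
  const pieces: string[] = text.match(/[^.!?]*[.!?]+|[^.!?]+$/g) || [];
  const chunks: string[] = [];
  let current = "";
  for (const piece of pieces) {
    // Hard-split any single piece that is longer than the limit.
    for (let i = 0; i < piece.length; i += maxChars) {
      const part = piece.slice(i, i + maxChars);
      if (current.length + part.length > maxChars) {
        chunks.push(current);
        current = "";
      }
      current += part;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Joining the chunks should always give back the original text.
const input = "again. There is never too much iodine solution";
console.log(chunkText(input, 20).join("") === input);
```

The design choice here is that the regex only decides where to cut, never what to keep, so a passage without closing punctuation still ends up in some chunk.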

Update will be live as soon as it's approved by Google.

superluig164 commented 3 years ago

I was right! :P thanks for fixing it, glad I could help!