joelpurra / talkie

Text-to-speech browser extension button. Select text on any web page, and have the computer read it out loud for you by simply clicking the Talkie button.
https://joelpurra.com/projects/talkie/
GNU General Public License v3.0

Missing audio and the start of paragraphs or bullets #25

Open PastTenseOfDraw opened 2 years ago

PastTenseOfDraw commented 2 years ago

Please fill out the following to help us help you. Replace ... with your own values, where applicable.

Expected behavior

Reading the full line

Actual behavior

It starts reading a few words or syllables in.

Steps to reproduce behavior

The easiest way is to listen to b

Website

Text and language

The text that is in italics is being skipped.

_As_sess We designed the initial leg of our one-on-one coaching course to find your digital accessibility knowledge baseline. You will discover your current status within the technical realm of accessibility, identify any gaps in your knowledge, understanding, or execution of accessibility standards.

_Communic_ation Our communication coaching encourages you to embrace your ability to understand inclusion with empathy and develop and hone your presentation skills. The ability to translate your ethics, diversity, and inclusivity standards into a digestible format is key to ensuring that your accessibility career is flourishing and long-lasting.

_Conne_ct It makes sense that once you have achieved a level of cohesion and understanding within an organization, you would want to ensure that your clients are aware of your accessibility efforts but that they understand and support them as well. Our coaching will promote your ability to sell your newly embraced culture.

_Enga_ge Time for execution, time to put action to words. The final stage of our training will ensure that you can enact the changes you have outlined within your team, workplace, or organization. It is the climax of the one-on-one coaching where you make accessibility natural, real and normal.

System information

Additional information

...

joelpurra commented 1 year ago

@PastTenseOfDraw: thank you for the report! The page seems to have been deleted, but I replaced it with an Internet Archive link.

I am unfortunately not able to reproduce the issue on current browser versions on macOS 10.14 (Chrome, Firefox), Linux/Ubuntu 22.04 (Firefox, Chrome, Edge), or Windows 10 (Firefox, Chrome, Edge, Vivaldi). I tested using a few, but not all, of the available voices.

My guess is that the particular voice you were using does (did?) not work as intended. It may have been a (temporary?) internet connectivity issue when using an "online" voice, augmented by speaking shorter text "parts" which each require streaming a prepared audio snippet. If there's a delay, the voice may choose to "skip" some parts to compensate. (Why compensate? Not sure, but could even be something on the audio hardware/driver level.)

Do you still have the issue, also on other pages? If so, please answer the following.


Since Talkie merely hands over the text to the voice, via the browser TTS engine, the voice is responsible for the quality. Some voices do have issues, similar to stuttering at the start of sentences/parts, but not usually bad enough to skip several words. If you had issues using an "online" voice, such as many Google voices built into Chrome, internet connectivity and speed matter.
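To check whether a voice is an "online" one, the Web Speech API exposes a `localService` flag on each voice. Below is a minimal sketch, not taken from Talkie's code: `VoiceLike`, `findOnlineVoices`, and the sample voice names are hypothetical, and the assumption is that online voices report `localService` as `false` (which may vary between browsers).

```typescript
// Structural subset of the browser's SpeechSynthesisVoice, so the sketch
// also type-checks outside a browser.
interface VoiceLike {
  name: string;
  lang: string;
  localService: boolean;
}

// Hypothetical helper (not part of Talkie): voices that are not local
// services are assumed to rely on a remote server to synthesize audio.
function findOnlineVoices<T extends VoiceLike>(voices: readonly T[]): T[] {
  return voices.filter((voice) => !voice.localService);
}

// Made-up sample data; in a browser, pass speechSynthesis.getVoices() instead.
const sampleVoices: VoiceLike[] = [
  { name: "Example Local Voice", lang: "en-US", localService: true },
  { name: "Example Online Voice", lang: "en-US", localService: false },
];

const onlineVoiceNames = findOnlineVoices(sampleVoices).map((voice) => voice.name);
```

If the voice you were using shows up in such a list, switching to a local voice is a quick way to rule out connectivity as the cause.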

If connectivity is flaky, one factor might be that Talkie can (be configured to) either speak shorter sentences/parts, or the whole selected text at once. By this I mean that the TTS engine is either spoon-fed small parts, or gets the entire text so that it can act on it as a whole (adjusting for natural pauses, question/exclamation marks at the very end of long sentences, etcetera). Speaking short parts is the default in Chrome, due to long-standing bugs where speech would stop after a short while. (See #1; also mentioned this in #22.)
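The difference between the two modes can be sketched roughly as below. This is only an illustration under assumptions, not Talkie's actual implementation: `SpeakMode`, `planUtterances`, and the naive sentence-splitting rule are all hypothetical.

```typescript
type SpeakMode = "whole" | "parts";

// Hypothetical helper, not Talkie's actual implementation: decide which
// strings would be handed to the TTS engine, one utterance per string.
function planUtterances(text: string, mode: SpeakMode): string[] {
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    return [];
  }
  if (mode === "whole") {
    // One utterance: the engine sees all punctuation and context at once.
    return [trimmed];
  }
  // Naive split: a run of text followed by any ".", "!" or "?" characters.
  const matches = trimmed.match(/[^.!?]+[.!?]*/g) ?? [];
  return matches
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

const demoText = "Time for execution, time to put action to words. Is it natural? Yes!";
const asWhole = planUtterances(demoText, "whole"); // 1 utterance
const asParts = planUtterances(demoText, "parts"); // 3 utterances
```

With short parts, every string is a separate request to the voice, so any per-request startup delay is paid once per part instead of once per selection.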

Sometimes (I suspect that) initializing the audio engine takes a relatively long time for each separately spoken text part. This may also depend on system hardware/drivers, settings, etcetera. For example, speaking text over wireless speakers/headphones may introduce audio buffering issues, which cause speakers to "skip" parts to "catch up". Speaking the entire text at once may work for your operating system/browser/voice; see Talkie's settings.


The provided section, including the heading above, is rendered from the (shortened) HTML below.

<h2 class="heading-2">What Does The Coaching Include?</h2>
<ol role="list">
    <li><strong>Assess</strong><br>We designed the initial leg of ...</li>
</ol>

Talkie doesn't (directly) bother with HTML tags though -- the selected text is extracted from HTML by the browser's DOM API. Speaking this selection in small parts would feed the "label" of the bullet point ("Assess") separately from the rest of the text ("We designed the initial leg of ..."). This is due to a hardcoded line-break <br/> after the <strong> label. (The semantic HTML style can, as always, be discussed.)

When reading the section, including the heading, Talkie (via the browser's DOM) extracts the (shortened) text below.

"What Does The Coaching Include?\nAssess\nWe designed the initial leg of ..."

You can see this if you open the browser console for the page when using Talkie. Line-breaks (\n versus \r\n variations), and "indentation" spaces (and thus total text length) may differ between browsers.

Notice the \n line-breaks after the heading and the bullet point "label"; they are interpreted as "part" separators by Talkie. It seems the text audibly being skipped when you read it was at the start of parts, being fed to the TTS engine separately. Again, this may point to various system issues -- but most likely an online voice and a flaky internet connection.
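The part separation described above can be illustrated with the shortened text from this page. The sketch below is a simplified assumption of how line-breaks act as separators, not Talkie's real code; `splitIntoLineParts` is a hypothetical name.

```typescript
// Hypothetical helper: split extracted text into parts at line-breaks,
// accepting both \n and \r\n, and dropping indentation and empty lines.
function splitIntoLineParts(extractedText: string): string[] {
  return extractedText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

const extracted =
  "What Does The Coaching Include?\nAssess\nWe designed the initial leg of ...";
const lineParts = splitIntoLineParts(extracted);
// lineParts:
// ["What Does The Coaching Include?", "Assess", "We designed the initial leg of ..."]
```

Each of the three strings would be spoken separately, which matches where the skipped audio was reported: at the start of the bullet "label" and of the text following it.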