drmfinlay / tts-util-app

TTS Util — Text-to-speech utility Android app for synthesising text into audible speech
Apache License 2.0
insert a custom pause between phrases #18

Closed Balamoote closed 2 years ago

Balamoote commented 2 years ago

It would be nice to add a setting:

  • add a custom pause between phrases (after . ! ? - — – ...)
  • add custom pauses between words

Could be really useful for some people.

drmfinlay commented 2 years ago

Hello Balamoote,

Thank you for opening this issue. A short pause of one hundred milliseconds is currently inserted between utterances and at new lines. A settings screen for customising pause values can certainly be added. I'll look into this. In the meantime, do you have any suggestions for default pause values for the cases you mention?

Just in case you are unaware, the speech rate may be changed in the settings. Open the sidebar menu, select Settings, then Speech Rate. This may not work for some text-to-speech engine apps or available voices.

Cheers, Dane Finlay

Balamoote commented 2 years ago

Hello Danesprite! Yes, I am aware of the setting, thanks for pointing that out.

As to the default pause settings, I would suggest 200 ms after a sentence and 400 ms after a paragraph. However, this heavily depends on one's personal preferences and the text itself. That is why it would be nice to be able to tweak the values depending on the situation.

drmfinlay commented 2 years ago

Hello Balamoote,

These defaults sound reasonable to me. I agree that the values should be configurable for the reasons you mention.

Thinking about it, this would seem to conflict somewhat with the speech rate setting. For instance, if one had the speech rate at 3x, it may be a little jarring to have pauses that do not take this factor into consideration. It would not be difficult to apply this factor before dispatching the silent pause utterances.

So, using a 3x speech rate and your pause values for sentences and paragraphs, the application would, with this adjustment, reduce the pauses to a third of their configured values: roughly 67 ms and 133 ms respectively.

Do you think it would be acceptable to do this and have a warning on the settings screen for the pauses? Something like "A custom speech rate will be applied to these pause values"?
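The adjustment proposed above amounts to dividing each configured pause by the speech rate. The arithmetic can be sketched in a few lines of Python; `scaled_pause_ms` is a hypothetical helper name used for illustration only and is not taken from the TTS Util source.

```python
def scaled_pause_ms(pause_ms: float, speech_rate: float) -> int:
    """Return the pause duration adjusted for the speech rate, in milliseconds.

    Hypothetical helper: divides the configured pause by the rate and rounds
    to the nearest millisecond.
    """
    if speech_rate <= 0:
        raise ValueError("speech_rate must be positive")
    # At a 3x rate, speech plays three times faster, so the pause is divided by 3.
    return round(pause_ms / speech_rate)


sentence_pause = scaled_pause_ms(200, 3.0)   # 200 / 3 rounds to 67
paragraph_pause = scaled_pause_ms(400, 3.0)  # 400 / 3 rounds to 133
```

Rounding to whole milliseconds is an assumption made here; the thread does not say how fractional values would be handled.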

Balamoote commented 2 years ago

Hello Danesprite,

I think maybe an additional option "pause is proportional to speech rate" could solve this (and a warning won't hurt). Basically, we must avoid a situation where the user sets a duration of, say, 500 ms at speed 2 and then wonders why the pause is only 250 ms. For my use case, only absolute numbers are of any practical meaning.

Another way is to just define pauses for speech rate "1.0" and let the user do the math.

Usually, I would choose a certain speech rate and then adjust pauses, and here only absolute durations in ms make sense. Also, consider the Google speech engine: from version to version, the default perceived speech rate is sometimes different and needs to be adjusted, but the pauses do not. The pauses define the structure of the text and I prefer them to be "standard" for every situation.
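A rough Python sketch of how such an option might behave is given here. `PauseSettings`, its field names, and `effective_pause_ms` are hypothetical and exist only for illustration; they do not come from the application.

```python
from dataclasses import dataclass


@dataclass
class PauseSettings:
    # Hypothetical settings holder; the names are illustrative only.
    sentence_pause_ms: int = 200
    paragraph_pause_ms: int = 400
    proportional_to_speech_rate: bool = False


def effective_pause_ms(configured_ms: int, speech_rate: float,
                       settings: PauseSettings) -> int:
    """Return the silence to insert, in milliseconds."""
    if settings.proportional_to_speech_rate:
        # Proportional mode: faster speech gets shorter pauses.
        return round(configured_ms / speech_rate)
    # Absolute mode: the configured value is used regardless of speech rate.
    return configured_ms


absolute = PauseSettings()
proportional = PauseSettings(proportional_to_speech_rate=True)
print(effective_pause_ms(500, 2.0, absolute))      # 500
print(effective_pause_ms(500, 2.0, proportional))  # 250
```

With the option off, a pause configured as 500 ms stays at 500 ms at speed 2, which is the absolute behaviour described above. With it on, the same setting yields 250 ms.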

drmfinlay commented 2 years ago

Hello Balamoote,

My apologies for taking a while to respond to your comment.

Thank you for suggesting an additional "pause is proportional to speech rate" option. I think this is the most sensible way of solving the problem, especially given what you mention about the default speech rate differing between engines and engine versions. I had noticed something similar with the pitch setting.

I hope to release a new version (v3) of the application next week. Customisable pause values will have to wait until the version after that. I'll add your suggested option in that version too.

drmfinlay commented 2 years ago

Hello Balamoote,

I have implemented some of your suggestions for the custom silence feature. In the next release, the settings will include options for silence after line endings, sentences, questions and exclamations.

The silence duration can be set from 100 ms to 1000 ms. The default value after line endings is 200 ms. This means, of course, that the value for paragraphs (two consecutive line endings) is, by default, 400 ms. As for pauses after sentences, questions and exclamations, the application is set to defer to the TTS engine as before (to avoid confusion upon upgrade).

I will note that the logic for inserting silence in place of line endings only looks for line feed (LF) characters, not carriage returns (CR). The line endings typically used by Windows (CRLF) are handled correctly, however. In addition, the logic for inserting silence after sentences, questions and exclamations checks for the ASCII (.?!) and the "Halfwidth" / "Fullwidth" forms (．？！) of characters; the latter are used in Japanese text, for example.
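The behaviour described above can be approximated with a short Python sketch. This is not the application's code: `split_with_silence`, the segment tuples, and the choice of the fullwidth code points U+FF0E, U+FF1F and U+FF01 are assumptions made for illustration. Passing `sentence_ms=None` stands in for "defer to the TTS engine".

```python
import re
from typing import List, Optional, Tuple, Union

Segment = Tuple[str, Union[str, int]]

# ASCII terminators plus the assumed fullwidth forms (U+FF0E, U+FF1F, U+FF01).
TERMINATORS = ".?!\uFF0E\uFF1F\uFF01"
_CLS = re.escape(TERMINATORS)
# A run of text ending in one or more terminators, or trailing text without one.
_SENTENCE = re.compile(rf"[^{_CLS}]*[{_CLS}]+|[^{_CLS}]+")


def split_with_silence(text: str, line_ending_ms: int = 200,
                       sentence_ms: Optional[int] = None) -> List[Segment]:
    """Split text into ("speak", str) and ("silence", int) segments."""
    segments: List[Segment] = []
    # Only LF marks a line ending; a lone CR is not treated as one.
    lines = text.split("\n")
    for index, line in enumerate(lines):
        line = line.rstrip("\r")  # CRLF works because the LF is still found
        chunks = [line] if sentence_ms is None else _SENTENCE.findall(line)
        for chunk in chunks:
            spoken = chunk.strip()
            if not spoken:
                continue
            segments.append(("speak", spoken))
            if sentence_ms is not None and spoken[-1] in TERMINATORS:
                segments.append(("silence", sentence_ms))
        if index < len(lines) - 1:
            # One silence per line ending; a blank line between paragraphs
            # therefore yields two silences, 400 ms in total by default.
            segments.append(("silence", line_ending_ms))
    return segments


print(split_with_silence("First line\r\nSecond line\n\nNew paragraph"))
```

The sentence splitting here is deliberately naive (a decimal such as "3.14" would be split at the point), so it should be read as an outline of the idea rather than a robust tokeniser.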

I thought about adding an option for inserting custom silence between each and every word, as you suggested, but decided against it. It is not technically incompatible with the other options, but, if you will permit me, I think it is a bad idea. Such a feature would take away the word context from the TTS engine, resulting in text/files being read in a very robotic manner. I do not wish to take this project in that direction.

A new release of TTS Util will be out in the next few days, first on the GitHub releases page, then on F-Droid, after which I will close this issue.

Balamoote commented 2 years ago

Hello Danesprite,

Thank you very much! Very good news indeed. I'm looking forward to the new release!

I agree with you on the pauses between words.

drmfinlay commented 2 years ago

This feature is now available in TTS Util version 4.0.0. See the releases page. The new version will be available on the Google Play and F-Droid stores soon.

Sorry this took me longer than a few days, @Balamoote.