daisy / pipeline-ui

A user interface for the DAISY Pipeline 2
MIT License

Ability to configure TTS engines in the UI #151

Open bertfrees opened 1 year ago

bertfrees commented 1 year ago
ways2read commented 1 year ago

I guess "the cloud based engines always work" isn't true if there is no internet connection.

bertfrees commented 1 year ago

Some ideas from yesterday's team meeting:

@rdeltour:

One thing that would be nice is a way to get a live preview of the TTS from a config page in the UI.

@bertfrees: (see https://github.com/daisy/pipeline-modules/issues/66)

I want to eliminate the "Text-to-speech configuration file" options, and replace them with the following:

  • a dedicated option to specify CSS style sheets (in addition to the possibility to attach style sheets to the input)
  • a dedicated option to specify lexicons (in addition to the possibility to attach lexicons to the input)
  • dedicated options for certain TTS properties
    • org.daisy.pipeline.tts.log: done
    • org.daisy.pipeline.tts.mp3.bitrate: to do
    • org.daisy.pipeline.tts.lame.cli.options: has been deprecated
  • it should not be possible anymore to set other TTS properties dynamically (per job) (note that org.daisy.pipeline.tts.host.protection has already been deprecated)
  • per-job voice configuration should be replaced by a system wide voice configuration
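
A minimal sketch of how the rule above (only a few TTS properties can be set per job) might be checked. The property keys are the ones named in the list. The function name, the result shape, and the decision to include `org.daisy.pipeline.tts.mp3.bitrate` (still marked "to do") are assumptions for illustration, not the Pipeline's actual behavior.

```typescript
// Properties that would remain settable per job (taken from the list above).
const PER_JOB_TTS_PROPERTIES = new Set<string>([
  "org.daisy.pipeline.tts.log",
  "org.daisy.pipeline.tts.mp3.bitrate", // marked "to do" in the list
]);

// Properties the list describes as deprecated.
const DEPRECATED_TTS_PROPERTIES = new Set<string>([
  "org.daisy.pipeline.tts.lame.cli.options",
  "org.daisy.pipeline.tts.host.protection",
]);

// Hypothetical result shape: what a job may set, and what it may not.
interface PropertyCheck {
  accepted: Record<string, string>;
  deprecated: string[];
  rejected: string[];
}

// Sort the properties requested for a job into the three buckets.
function checkJobProperties(requested: Record<string, string>): PropertyCheck {
  const result: PropertyCheck = { accepted: {}, deprecated: [], rejected: [] };
  for (const [key, value] of Object.entries(requested)) {
    if (PER_JOB_TTS_PROPERTIES.has(key)) {
      result.accepted[key] = value;
    } else if (DEPRECATED_TTS_PROPERTIES.has(key)) {
      result.deprecated.push(key);
    } else {
      // Anything else would have to come from the system-wide configuration.
      result.rejected.push(key);
    }
  }
  return result;
}
```

A UI could use the `rejected` and `deprecated` lists to explain why a setting from an old configuration file is no longer applied to a job.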

bertfrees commented 1 year ago

This issue is not quite fixed yet I think. https://github.com/daisy/pipeline-ui/pull/154 fixes only a part of it, namely selecting preferred voices.

marisademeglio commented 1 year ago

> This issue is not quite fixed yet I think. #154 fixes only a part of it, namely selecting preferred voices.

Can you create more issues for what still needs to be done? Or is this more about keeping everyone's ideas around?

bertfrees commented 1 year ago

It is more a collection of ideas. We can create some new issues with concrete things to do.

bertfrees commented 1 year ago

On second thought, the first comment in this issue sums it up pretty well I think. Instead of creating a new issue with more or less the same in it, I'm gonna reword this one, and convert it into a list of tasks.

marisademeglio commented 1 year ago

Will there be an API to describe engines' properties? Or should I hardcode it based on the engine configuration docs?

Voice preview will come from the API too, right? Is that ready?

bertfrees commented 1 year ago

I think hardcoding the properties makes the most sense for now. But an API for the engines can definitely be useful too. Let's keep the idea.
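
As a rough illustration of what hardcoding could look like in the UI, here is a sketch of a static table of engine fields. The interface names, the `missingFields` helper, and especially the `placeholder.*` property keys are invented for this example; the real keys would come from the engine configuration docs.

```typescript
// Hypothetical description of one configurable field of a TTS engine.
interface EngineField {
  key: string;     // placeholder key, NOT a real Pipeline property name
  label: string;   // text shown next to the input in the settings dialog
  secret: boolean; // whether the UI should mask the value
}

interface EngineDescriptor {
  id: string;
  name: string;
  fields: EngineField[];
}

// Hardcoded table; an engines API could replace this later.
const ENGINES: EngineDescriptor[] = [
  {
    id: "azure",
    name: "Azure",
    fields: [
      { key: "placeholder.azure.key", label: "Key", secret: true },
      { key: "placeholder.azure.region", label: "Region", secret: false },
    ],
  },
  {
    id: "google",
    name: "Google",
    fields: [{ key: "placeholder.google.apikey", label: "API key", secret: true }],
  },
];

// Return the labels of fields that are empty or missing in the saved values.
function missingFields(engine: EngineDescriptor, values: Record<string, string>): string[] {
  return engine.fields
    .filter((field) => (values[field.key] ?? "").trim() === "")
    .map((field) => field.label);
}
```

Keeping the table in one place means that switching to an API later would only change where `ENGINES` comes from, not the dialog code that reads it.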

Voice preview will come from the API, yes. I'm not sure yet what the API should look like though. I guess it could also be a general purpose "speak" command, that could even accept SSML. That wouldn't be so hard to do. (A while back I already wrote a mock of the Google TTS API that dispatches to the available TTS engines.)
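
To make the "speak" idea a bit more concrete, here is a sketch of a general-purpose command that accepts text or SSML and dispatches to whichever engine offers the requested voice. None of this is an existing API: `SpeakRequest`, `TtsEngine`, `pickEngine`, and `speak` are hypothetical names, and falling back to the first engine when no voice is given is an assumption.

```typescript
// Hypothetical request for a general-purpose "speak" command.
interface SpeakRequest {
  input: string;   // plain text or SSML markup
  isSsml: boolean; // true when `input` is SSML
  voice?: string;  // optional voice name; engine default when omitted
}

// Hypothetical engine abstraction; each available engine would implement it.
interface TtsEngine {
  name: string;
  voices: string[];
  synthesize(request: SpeakRequest): Promise<Uint8Array>; // encoded audio
}

// Choose the engine that offers the requested voice,
// or the first available engine when no voice is requested.
function pickEngine(engines: TtsEngine[], voice?: string): TtsEngine | undefined {
  if (voice === undefined) {
    return engines[0];
  }
  for (const engine of engines) {
    if (engine.voices.indexOf(voice) !== -1) {
      return engine;
    }
  }
  return undefined;
}

// Dispatch the request to the chosen engine.
async function speak(engines: TtsEngine[], request: SpeakRequest): Promise<Uint8Array> {
  const engine = pickEngine(engines, request.voice);
  if (engine === undefined) {
    throw new Error(`No available engine for voice: ${request.voice ?? "(default)"}`);
  }
  return engine.synthesize(request);
}
```

A voice preview in the settings dialog would then be one call to `speak` with a short sample sentence and the voice being previewed.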

marisademeglio commented 11 months ago

Added credential fields for Azure and Google voices in 4d97c87be89cf94428d999aeef0016ac28f0c63a

Verifying the credentials is a new issue: https://github.com/daisy/pipeline-ui/issues/164

From this convo we have now implemented all the engine settings for our current goal

marisademeglio commented 8 months ago

Noting here that we also got a feature request from a tester: "When selecting between voices, offer a preview."

marisademeglio commented 5 months ago

Is there anything left in the first task above that is still relevant? We have designed the settings dialog in a different way based on other convos about engine properties that we wanted to support.

And the "voice preview" task is still pending engine implementation.

bertfrees commented 5 months ago

> Is there anything left in the first task above that is still relevant?

The way the settings dialog looks now is great!

It's very minor, but it would be nice if the status (connected or disconnected) were somehow made even clearer. I don't know how, though.

By the way, this note at the top:

> After configuring these engines with the required credentials, they will be available under 'Voices'. Save and reopen the settings dialog to see changes.

Is it really needed? It seems the voices are updated without closing and reopening the settings.

marisademeglio commented 5 months ago

True, that wording can be simplified as the changes now are effective immediately.

How is this for a slightly clearer connected/disconnected status?

[Image: Screenshot 2024-04-17 at 09 35 56]

bertfrees commented 5 months ago

Yes, I also thought of emphasizing it visually like that. That is indeed slightly better, however I don't know whether that fundamentally changes anything? 'Cause it will just be decoration, right?

Perhaps the engines could be grouped by connection status? There would be two main headings with the connection status, under which the subheadings "Azure" and "Google" would go. The main headings wouldn't need to be visible for sighted users, the sections could be indicated some other way.

Just thinking out loud. As I said, it is already good the way it is now.

marisademeglio commented 5 months ago

Ok I will commit this since it seems better.

We can keep this thread going for ideas. I don't like grouping it by connection status because then doesn't it get reordered when the status changes? That's visually disruptive and probably bad accessibility.
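
A small sketch of the trade-off, under assumed names (`EngineStatus`, `statusLabels`, and `groupByStatus` are made up for illustration). The first function keeps engines in a fixed order and states the status as text, so it is not decoration only. The second groups by status, which changes the order whenever a status flips.

```typescript
type ConnectionStatus = "connected" | "disconnected";

interface EngineStatus {
  name: string;
  status: ConnectionStatus;
}

// Fixed order: each engine keeps its position and the status is spelled out
// as text, so it can be read by assistive technology as well as seen.
function statusLabels(engines: EngineStatus[]): string[] {
  return engines.map((engine) => `${engine.name}: ${engine.status}`);
}

// Grouped order: connected engines first, then disconnected ones.
// An engine moves to the other group as soon as its status changes.
function groupByStatus(engines: EngineStatus[]): EngineStatus[] {
  const connected = engines.filter((engine) => engine.status === "connected");
  const disconnected = engines.filter((engine) => engine.status === "disconnected");
  return [...connected, ...disconnected];
}
```

With Azure disconnected and Google connected, `statusLabels` still lists Azure first, while `groupByStatus` moves Google above it.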

bertfrees commented 5 months ago

Yes, that's probably true.

marisademeglio commented 3 months ago

Voice preview team discussion:

https://daisy-dev.slack.com/archives/C064GB8U9/p1719243930802729
https://daisy-dev.slack.com/archives/C064GB8U9/p1719244054736299