whatwg / webidl

Web IDL Standard
https://webidl.spec.whatwg.org/

Extending an interface with boolean attribute with default value false #880

Open guest271314 opened 4 years ago

guest271314 commented 4 years ago

The Web Speech API mentions SSML (Speech Synthesis Markup Language) in its description of the text attribute:

https://wicg.github.io/speech-api/#dom-speechsynthesisutterance-text

text attribute, of type DOMString This attribute specifies the text to be synthesized and spoken for this utterance. This may be either plain text or a complete, well-formed SSML document. [SSML] For speech synthesis engines that do not support SSML, or only support certain tags, the user agent or speech engine must strip away the tags they do not support and speak the text. There may be a maximum length of the text, it may be limited to 32,767 characters.

The specification does not clearly describe how to signal to the speech synthesis processing application whether the text passed to the SpeechSynthesisUtterance constructor, or set on the text attribute, is SSML or plain text that is not intended to be interpreted and parsed as a markup language: https://github.com/WICG/speech-api/issues/10.

Existing speech synthesis engines and processing interfaces already have infrastructure in place to parse SSML (https://github.com/brailcom/speechd/issues/301#issuecomment-623228102), within the scope of the language of the existing Web Speech API specification. What is missing is a clear set of procedural steps for signaling the engine to process text as SSML. After filing multiple issues, including one where the patch to turn SSML processing on for speech-dispatcher (used by Chromium and Firefox as a speech engine interface) sat, and still sits, unmerged, I decided to implement the SSML parsing algorithm, as described in the official specification, in JavaScript (https://github.com/guest271314/SSMLParser). The goal was to demonstrate that the request to implement SSML parsing is achievable, and that the reason browsers have not implemented this part of the specification is not a technical one.

The specification states that text can be SSML, but provides no means to signal the speech synthesis engine, directly or through an interface, to interpret and parse the text attribute input as SSML.

The simplest fix I have conceived of so far, so that specification language and IDL are in place for when the specification authors decide to address the issue, is to define an ssml attribute on SpeechSynthesisUtterance.

After that, all of the required pieces of an implementation would be ready to be written into a specification in some form and simply turned on in browser source code.

https://wicg.github.io/speech-api/#speechsynthesisutterance

[Exposed=Window]
interface SpeechSynthesisUtterance : EventTarget {
    constructor(optional DOMString text);

    attribute DOMString text;
    attribute DOMString lang;
    attribute SpeechSynthesisVoice? voice;
    attribute float volume;
    attribute float rate;
    attribute float pitch;

The addition would be a boolean attribute with a default value of false, indicating that text is not expected to be interpreted as SSML unless the ssml attribute is set to true.

    attribute boolean ssml = false

The language that would describe the ssml attribute:

ssml attribute, of type boolean This attribute, if true, signals to the speech synthesis engine to interpret the text attribute as SSML. The default value is false.

Is the attribute extension in Web IDL correct and consistent with the language defining the attribute?

TimothyGu commented 4 years ago

Attributes can't have default values in Web IDL. The customary way to specify something like this is to have the SpeechSynthesisUtterance object possess an "SSML flag" internal slot that defaults to false when an object is first created. The getter steps of the ssml attribute would be to return the value of this flag, and the setter steps would be to set the "SSML flag".
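
As a rough illustration of the internal-slot pattern described above, the following JavaScript sketch models it with a private class field. UtteranceSketch is a hypothetical stand-in, not the real SpeechSynthesisUtterance, and the Boolean() call only approximates the boolean conversion that Web IDL bindings would perform.

```javascript
// Hypothetical stand-in for illustration; NOT the real SpeechSynthesisUtterance.
class UtteranceSketch {
  // The private field plays the role of the "SSML flag" internal slot.
  // It is false when the object is first created.
  #ssmlFlag = false;
  text = "";

  constructor(text = "") {
    this.text = String(text);
  }

  // Getter steps: return the value of the SSML flag.
  get ssml() {
    return this.#ssmlFlag;
  }

  // Setter steps: set the SSML flag to the given value.
  // Boolean() approximates the conversion to an IDL boolean (an assumption).
  set ssml(value) {
    this.#ssmlFlag = Boolean(value);
  }
}

const u = new UtteranceSketch("<speak>hello</speak>");
// u.ssml starts out false and only changes when script assigns to it.
```

The default lives in the object's creation steps rather than in the attribute declaration, which is why no `= false` is needed on the attribute itself.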

That being said, the API design looks pretty odd. In particular, forcing users to modify the ssml and text attributes separately could contribute to a temporary state of inconsistency between ssml and text, which IMO would be poor design. (E.g., text has been set to SSML text while ssml is still false.) In response to this, I'd propose making ssml settable through the constructor but immutable otherwise. In fact, if I were to design the API, I'd make all of these properties (text, volume, etc.) readonly; but if that's no longer possible, making ssml immutable and only settable through the constructor is the next best thing.

guest271314 commented 4 years ago

@TimothyGu

That being said, the API design looks pretty odd. In particular, forcing users to modify the ssml and text attributes separately could contribute to a temporary state of inconsistency between ssml and text, which IMO would be poor design. (E.g., text has been set to SSML text while ssml is still false.)

This is what we have. From the specification

This may be either plain text or a complete, well-formed SSML document.

No language discloses how to disambiguate between the two potential inputs.

In response to this, I'd propose making ssml settable through the constructor but immutable otherwise. In fact, if I were to design the API, I'd make all of these properties (text, volume, etc.) readonly; but if that's no longer possible, making ssml immutable and only settable through the constructor is the next best thing.

How does that look in "Web IDL" verbiage?

guest271314 commented 4 years ago

speechSynthesis.speak(new SpeechSynthesisUtterance('<p>hello<break>universe</p>', {ssmlMode: true})) // defined at SpeechSynthesisUtterance

or

speechSynthesis.speak(new SpeechSynthesisUtterance('<p>hello<break>universe</p>'), {ssmlMode: true}) // defined at speak

Or would it be not defined when trying to set it after the constructor has run?

const u = new SpeechSynthesisUtterance();
u.text =  '<p>hello<break>universe</p>';
u.ssmlMode = true;
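
One possible reading of the constructor-only suggestion made earlier in the thread, sketched in JavaScript. UtteranceSketch, the init dictionary, and the IDL fragment in the comments are all hypothetical; the ssmlMode name is taken from the snippets above. Under this reading the flag is fixed at construction, and a later assignment has no effect (in strict mode it throws a TypeError, because the accessor has no setter).

```javascript
// Hypothetical stand-in for illustration; NOT the real SpeechSynthesisUtterance.
// An assumed IDL shape for this design might be:
//   dictionary SpeechSynthesisUtteranceInit { boolean ssmlMode = false; };
//   constructor(optional DOMString text,
//               optional SpeechSynthesisUtteranceInit init = {});
//   readonly attribute boolean ssmlMode;
class UtteranceSketch {
  #ssmlFlag;

  constructor(text = "", init = {}) {
    this.text = String(text);
    // A missing ssmlMode member is treated as false, like a dictionary default.
    this.#ssmlFlag = Boolean(init.ssmlMode);
  }

  // Getter only: no setter is defined, so the flag cannot change later.
  get ssmlMode() {
    return this.#ssmlFlag;
  }
}

// Tries the post-construction assignment in strict mode and reports the result.
function trySetSsmlMode(utterance) {
  "use strict";
  try {
    utterance.ssmlMode = true;
    return "assigned";
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "other error";
  }
}

// Flag supplied at construction time.
const viaConstructor = new UtteranceSketch("<p>hello<break>universe</p>", { ssmlMode: true });

// Flag set afterwards, as in the last snippet above.
const later = new UtteranceSketch();
later.text = "<p>hello<break>universe</p>";
const outcome = trySetSsmlMode(later);
```

With this shape, text and the flag cannot disagree because of a half-finished pair of assignments to the flag, although text itself remains mutable here, as it is in the current interface.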