w3c / mediacapture-main

Media Capture and Streams specification (aka getUserMedia)
https://w3c.github.io/mediacapture-main/

'label' needs direction and language metadata #665

Open xfq opened 4 years ago

xfq commented 4 years ago

Moving from https://www.w3.org/Mail/flatten/index?subject=i18n-ISSUE-464&list=public-media-capture :


The 'label' value is described as follows:

User Agents MAY label audio and video sources (e.g., "Internal microphone" or "External USB Webcam"). The MediaStreamTrack.label attribute MUST return the label of the object's corresponding source, if any. If the corresponding source has or had no label, the attribute MUST instead return the empty string.

Since the value is intended to contain natural language text, probably for consumption/display to the end-user, maybe it should be possible to determine or set the language (@lang) and base direction (@dir) of the text. This will allow the text to be displayed properly in different contexts.

In addition, it may be useful to allow multiple labels in different languages (although generally the source's label is applied by the user's user agent, and so will presumably be appropriately localized?)
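As a rough illustration of why the metadata matters at display time, here is a minimal sketch of how a page might render a label if language and direction were known. The helper names `escapeHtml` and `labelToHtml` are hypothetical, and the `lang` and `dir` values are assumed to come from somewhere; the current API exposes only the plain string.

```typescript
type LabelDirection = "ltr" | "rtl" | "auto";

// Escape characters that are significant in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap a device label in a span carrying its language and base direction,
// so the browser can pick fonts and lay out bidirectional text correctly.
function labelToHtml(label: string, lang: string, dir: LabelDirection): string {
  return (
    '<span lang="' + escapeHtml(lang) + '" dir="' + dir + '">' +
    escapeHtml(label) +
    "</span>"
  );
}
```

For example, `labelToHtml("耳塞", "zh-Hant", "ltr")` yields `<span lang="zh-Hant" dir="ltr">耳塞</span>`. Without such metadata the page has to guess both values.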

dontcallmedom commented 4 years ago

cf https://github.com/w3c/mediacapture-screen-share/issues/135#issuecomment-579713729

xfq commented 4 years ago

In that issue you mentioned "the localizability of DOM error messages (assuming it is needed) would need to be handled at the platform level, not at the individual API level", but this issue is about audio/video source labels, rather than error messages. Do you mean the labels should also be handled at the platform level?

dontcallmedom commented 4 years ago

sorry, I was confused. There is still something platform-bound here, the localization of JS strings - can you share more about the status of adoption of the Localizable pattern in WebIDL specs?

(my recollection from previous discussions on this is that OSes themselves don't really provide localization information on the underlying device information, but I'll need to dig this back up from our 2015 discussions)

dontcallmedom commented 4 years ago

here is the analysis we had made back then (May 26 2015 teleconf):

This issue is thorny. The identifiers returned as "label" in present implementations are mainly from underlying drivers, and it seems impossible for a UA to reliably provide translations of these labels. A simple fallback seems to be needed. Since the mediaDevices.enumerateDevices call is only defined on Navigator, we ask whether it's possible to state that the @lang attribute is window.navigator.language, and whether it's appropriate to determine @dir via the same method.

here is the conclusion we had come to:

The Working Group feels the topic of localizing human readable strings in JavaScript needs to be solved at the platform level rather than in this particular specification. https://www.w3.org/2016/03/getusermedia-wide-review.html

I understand there were also unsuccessful attempts to get more precise guidance from the I18N WG.

aphillips commented 2 years ago

The I18N WG, in our teleconference of 2021-09-30, as part of reviewing whether we were satisfied with MC&S prior to transition, found that we are not satisfied by the resolution of this item. JavaScript is not responsible for providing a data type specifically for natural language text values. While I18N's effort to get formally defined types into WebIDL and other infrastructural specifications is ongoing (and it would be helpful to have the support of groups such as media streams in making progress here), that doesn't remove the internationalization problem from this specification. Please revisit this issue.

alvestrand commented 2 years ago

Did we ever get a response on the question of whether window.navigator.language is an appropriate value for @lang?

aphillips commented 2 years ago

@alvestrand I expect you mean the discussion in @dontcallmedom's comment. If no other information is available, window.navigator.language (as a proxy for the system locale) is an appropriate fallback.

Imputing @dir from a language is not recommended as the primary means of supplying direction metadata, but also can be used in cases where metadata is not available. For your specification, you should probably provide a three stage process: (i) the caller supplies the value; (ii) first strong detection on the display string; (iii) imputed from the language tag.
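The three-stage process described above could be sketched roughly as follows. This is only an illustration: the function names are hypothetical, the list of right-to-left language subtags is not exhaustive, and the character ranges are a coarse approximation of Unicode strong directional classes rather than a full implementation of first-strong detection.

```typescript
type TextDirection = "ltr" | "rtl";

// Primary language subtags usually written right-to-left (illustrative, not exhaustive).
const RTL_LANGUAGE_SUBTAGS = ["ar", "dv", "fa", "he", "ps", "ur", "yi"];

// Coarse approximations of strongly directional characters.
const RTL_CHAR = /[\u0590-\u08FF\uFB1D-\uFDFF\uFE70-\uFEFF]/;
const LTR_CHAR =
  /[A-Za-z\u00C0-\u024F\u0370-\u03FF\u0400-\u04FF\u3040-\u30FF\u4E00-\u9FFF\uAC00-\uD7AF]/;

// Stage (ii): direction of the first strongly directional character, if any.
function firstStrongDirection(text: string): TextDirection | undefined {
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (RTL_CHAR.test(ch)) return "rtl";
    if (LTR_CHAR.test(ch)) return "ltr";
  }
  return undefined; // only digits, punctuation, or other neutral characters
}

// Stage (iii): impute the direction from the language tag's primary subtag.
function directionFromLanguage(lang: string): TextDirection {
  const primary = (lang.split("-")[0] || "").toLowerCase();
  return RTL_LANGUAGE_SUBTAGS.indexOf(primary) >= 0 ? "rtl" : "ltr";
}

// Stage (i) first: use the supplied value when the implementation has one.
function resolveDirection(
  label: string,
  lang: string,
  supplied?: TextDirection
): TextDirection {
  if (supplied !== undefined) return supplied;
  const detected = firstStrongDirection(label);
  if (detected !== undefined) return detected;
  return directionFromLanguage(lang);
}
```

Here `lang` stands in for whatever language tag is available, for instance the value of window.navigator.language when nothing better is known.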

dontcallmedom commented 2 years ago

for clarity, the labels are NOT coming from the API caller, but from the browser and the operating system

aphillips commented 2 years ago

@dontcallmedom Yes, I know. The data comes from the browser and/or operating system (or from API calls that the provider makes, such as enumerating devices). Perhaps I should have said "implementation" instead of "caller".

alvestrand commented 2 years ago

The distinction between "caller" and "implementation" is critical here, because it turns the question on its head; if we assume that the platform knows the language and direction of these attributes, the question becomes how the platform can expose these attributes to the caller.

If these were fundamental attributes of DOMString, or of an equivalent subclass type, this problem would have been solved by simply using that string type; end of story.

It seems that adding a sentence saying that "the language of this attribute MUST be consistent with window.navigator.language" would be the current best approach to providing a @lang attribute for the string - but I don't see a similar interface that is appropriate for reporting the @dir attribute.

To my mind, this is a platform problem, not a WebRTC problem. Which is what the May 26 2015 teleconference minutes were intended to indicate.

Do we have a solution we can reference, or is there no such solution?

dontcallmedom commented 2 years ago

I think the solution would be to make labels use something like the Localizable dictionary instead of a DOMString ("something" in the sense that it would be best made an interface with a stringifier method that defaults to the value attribute), with the dir and lang attributes set by the browser based on:

alvestrand commented 2 years ago

Breaking compatibility with existing use of "label" is a no-no, so whatever we do, acting as if "label" was a DOMString must continue to work.

Is there an existing example of a spec that uses Localizable with a stringifier in this way?

I see that HTML's "VideoTrack" construct (https://html.spec.whatwg.org/multipage/media.html#videotrack) has "label" and "lang" as separate attributes, but no "dir" attribute. Is there an example of a spec that uses Localizable with a "dir" attribute at all?

(As usual, I am very hesitant to commit the WebRTC WG to breaking new ground in areas outside the WG's competence.)

dontcallmedom commented 2 years ago

the point of using the stringifier is precisely to avoid breaking backwards compatibility - any code that uses label as a string would still have it behave as a string.
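To make the compatibility argument concrete, here is a small sketch of how an object with a stringifier behaves from script. The class name `LabelWithMetadata` is hypothetical and only mimics, in plain TypeScript, what a WebIDL interface with a stringifier would expose; it is not taken from any specification.

```typescript
type LabelDir = "auto" | "ltr" | "rtl";

// Stand-in for an interface whose stringifier returns the label text.
class LabelWithMetadata {
  readonly value: string;
  readonly lang: string;
  readonly dir: LabelDir;

  constructor(value: string, lang: string, dir: LabelDir) {
    this.value = value;
    this.lang = lang;
    this.dir = dir;
  }

  // Plays the role of the WebIDL stringifier.
  toString(): string {
    return this.value;
  }
}

const deviceLabel = new LabelWithMetadata("External USB Webcam", "en", "ltr");

// Existing code that concatenates or interpolates the label keeps working.
const caption = "Camera: " + deviceLabel;
const title = `${deviceLabel} (default)`;

// Code that checks the type sees an object, not a primitive string.
const kind = typeof deviceLabel;
```

Concatenation, template literals, and `String(label)` all go through the stringifier. Code that relies on `typeof label === "string"` or on strict equality with a string literal would observe a difference, so the compatibility is broad but not total.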

The only example of an IDL definition that I could find in WebIDLpedia using dir and lang in combination (but not using the stringifier trick) is https://notifications.spec.whatwg.org/#dictdef-notificationoptions

dontcallmedom commented 2 years ago

the Localizable dictionary itself is only defined as part of an example, so I'm not sure it is meant to be used as is by specifications. It isn't used by any at this stage, in any case.

alvestrand commented 2 years ago

That's the problem with examples given as recommendations - it's impossible to track down whether they are actually used in practice.

If I read https://heycam.github.io/webidl/#idl-stringifiers correctly, this construct:

dictionary LocalizableWithStringDefault { stringifier DOMString value; DOMString lang; DOMString dir; }

would do the trick.

@aphillips would such a construct satisfy your concerns?

Given that we would still need 2 browsers to implement this feature in order to get MEDIACAPTURE-MAIN to REC status, I am not committing to the idea; I am asking whether that would satisfy the concerns raised.

dontcallmedom commented 2 years ago

stringifier can only exist on interfaces, and the direction attribute can be constrained in terms of values, so it would be:

interface LocalizableWithStringDefault {
  stringifier attribute DOMString value;
  attribute DOMString lang;
  attribute TextDirection dir;    // would default to "auto"; WebIDL attributes cannot declare default values
};
enum TextDirection {    // or we could re-use the already defined NotificationDirection enum
  "auto",
  "ltr",
  "rtl"
};

(concretely, the interface would need to be named something more descriptive, e.g. MediaDeviceLabel; alternatively, we could try and upstream a LocalizableString to WebIDL - paging @w3c/webidl-design to get a sense of feasibility).

annevk commented 2 years ago

See https://github.com/heycam/webidl/issues/1025 though this seems like a different scenario as the values are coming from the OS?

plehegar commented 2 years ago

See also https://github.com/w3ctag/design-reviews/issues/716

dontcallmedom commented 1 year ago

Reporting my current understanding of the situation for this issue:

As a result, my sense is that it's likely the Working Group would request to move forward with the current unlocalizable DOMString attribute until there is both a technical solution provided in EcmaScript and a clearer path for implementations to surface usable data.

With that being said, I note a new proposal in the context of label for audio output https://github.com/w3c/mediacapture-output/issues/133 that may make localizability more clearly needed in the short term.

annevk commented 1 year ago

As long as lang and dir dictionary members can be added at a later time I think that's fine. See also https://github.com/whatwg/webidl/issues/1025#issuecomment-936120400. I don't think we should be adding a new string type for this to ECMAScript (the organization is Ecma now, but the language uses the old spelling still).

aphillips commented 1 year ago

Operating/windowing systems provide a way to set the base direction and language. Generally these default from the user's locale. See, for example, macOS, Java, Windows, Android, etc.

Creating examples of broken display is trivial to do if you use some of the examples in String-Meta. I have a demo HTML page that uses JavaScript to show our examples live (to address Domenic's comment on our TAG issue) which I will try to post in the next few days.

I'd prefer if we added attributes now rather than later--because I suspect that "later" you'll want to use the backwards compatibility argument for not adding them 😉.

annevk commented 1 year ago

If they cannot be added in a compatible manner I would agree with you that there's a problem. But I suspect that's not the case? Adding dictionary members is pretty much free.

alvestrand commented 1 year ago

I think the current proposal is to replace DOMString in the API with an object having 3 attributes (the string, "lang" and "dir") and a stringifier for backwards compatibility with users that expect a DOMString, but I'm not quite sure it hasn't been replaced with something else.

dontcallmedom commented 1 year ago

The group hasn't been able to come up with a real-life situation where the lack of direction/language metadata would prevent representing information coming from the operating system (in particular because operating systems don't seem to surface this metadata in the context of device names).

Since there is a path for a backwards-compatible change if/when that problem can be demonstrated, my suggestion would still be to close this issue without further spec change at this stage.

aphillips commented 1 year ago

Operating systems will not supply the metadata as directly associated with display strings, but they do have APIs for finding out the current display locale (used to look up the strings from resources) and the directionality of that locale. Plenty of display strings for devices contain numbers and/or Latin text that can be mixed with RTL text or can be in a language that needs help in (for example) font selection (such as Japanese vs. T.Chinese vs. S.Chinese).

dontcallmedom commented 1 year ago

@aphillips I've struggled to find examples of device names surfaced by getUserMedia in anything other than ASCII and English; I thus haven't been able to test whether the OS's current locale would or would not work to annotate these names. Given how hard it seems to even construct an example with real-life devices, let alone explore whether OSes provide relevant information for them, I still don't see how browsers could surface relevant information.

fippo commented 1 year ago

The only examples I can think of are localizations of devices that include the user's name, e.g. Philipp's herrausragende Ohrstöpsel. However, these are subject to sanitization anyway.

aphillips commented 1 year ago

@fippo why would the localization need to include the user's name? Ohrstöpsel or سدادة أذن or ear plug or 耳塞 are potential localized device names that can need language and direction information. Mixed direction names would occur when, for example, the devices are numbered (more than one is attached) or when more information (screen resolution, kHz, etc.) is included.
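When a label of known direction has to be placed inside a larger plain-text string (where markup such as dir is not available), one common workaround is to wrap it in Unicode bidi isolate controls so that numbers and Latin text next to it do not get reordered. A minimal sketch, assuming hypothetical helper names `isolate` and `describeDevice`:

```typescript
// Unicode bidi isolate control characters.
const LRI = "\u2066"; // LEFT-TO-RIGHT ISOLATE
const RLI = "\u2067"; // RIGHT-TO-LEFT ISOLATE
const FSI = "\u2068"; // FIRST STRONG ISOLATE
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE

// Wrap text so its internal ordering is isolated from the surrounding string.
// "auto" defers to first-strong detection by the bidi algorithm itself.
function isolate(text: string, dir: "ltr" | "rtl" | "auto"): string {
  const open = dir === "ltr" ? LRI : dir === "rtl" ? RLI : FSI;
  return open + text + PDI;
}

// Build a numbered, plain-text description of a device.
function describeDevice(
  index: number,
  label: string,
  dir: "ltr" | "rtl" | "auto"
): string {
  return "Device " + index + ": " + isolate(label, dir) + " (48 kHz)";
}
```

Choosing RLI or LRI requires knowing the direction, which is exactly the metadata under discussion; FSI is the fallback when it is unknown.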

dontcallmedom commented 1 year ago

@aphillips I don't think anyone disagrees that there are, in theory, good reasons why language and direction information may be needed in device labels. The point is rather that we could not identify any real situations where they are needed, and designing a solution to a theoretical problem is unlikely to yield good results.