w3ctag / design-reviews

W3C specs and API reviews
Creative Commons Zero v1.0 Universal

I18N String-Meta and WebIDL #716

Closed (aphillips closed this issue 2 years ago)

aphillips commented 2 years ago

صباح الخير TAG!

I'm requesting the TAG express an opinion on a "dispute" related to:

Explanation of the issue that we'd like the TAG's opinion on:

This isn't quite a "normal" technical dispute, but we do seek a conversation with TAG about the technical approach we are taking. We believe that interoperability of natural language strings between different Web APIs is strongly desirable.

Quoting our explainer:

We would like TAG to review our approach to this problem and discuss what the right long-term approach should be in the Web platform. We believe that this is an important gap for natural language support on the Web, but we are concerned that our current approach and comments generate churn or distract Working Groups attempting to complete work on specifications.

Our immediate request has to do with webidl#1025 wherein we requested that WebIDL add a Localizable type to IDL. This would allow specifications to reference this string type and save them creating a local dictionary representation. The WebIDL folks do not want to do this because it is at odds with their normal practice of providing only JavaScript primitives and types. They also don't want to become a registry of random dictionary entries.

One way to solve this would be if W3C and ECMA-402 proposed a natural language string type with these attributes to ECMA TC39. If that proposal were ultimately successful (and it will take at least one complete JavaScript release cycle to be accepted and reach the specification), then WebIDL could encode the type in their specification. This would be the most durable and platform-wide solution. On the downside, this would probably require 1-3 years before specifications would have a ready reference, and it is unclear whether such a type would be accepted or implemented by TC39.

Another alternative, possibly acting as a shim for eventual standardization by ECMA TC39, would be for I18N to define a dictionary and ask specifications to adopt it generally for natural language string values.
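As a rough illustration of what such a shared dictionary could look like from a developer's point of view, here is a minimal TypeScript sketch. The names `LocalizableString`, `TextDirection`, and `toLocalizable`, the field names, and the defaults are all assumptions made for this example; they are not the definition proposed in webidl#1025 or in String-Meta.

```typescript
// Hypothetical shape for a natural language string with metadata.
// Field names and defaults are illustrative assumptions, not a published standard.
type TextDirection = "ltr" | "rtl" | "auto";

interface LocalizableString {
  value: string;        // the natural language text itself
  lang?: string;        // BCP 47 language tag, e.g. "ar" or "en-US"
  dir?: TextDirection;  // base direction; "auto" leaves detection to the consumer
}

// Accept either a bare string (today's common API shape) or a record with
// metadata, and normalize both into a fully populated record.
function toLocalizable(input: string | LocalizableString): Required<LocalizableString> {
  const record: LocalizableString =
    typeof input === "string" ? { value: input } : input;
  return {
    value: record.value,
    lang: record.lang ?? "",   // empty string stands for "language unknown"
    dir: record.dir ?? "auto",
  };
}

const title = toLocalizable({ value: "صباح الخير", lang: "ar", dir: "rtl" });
const legacy = toLocalizable("Good morning");
```

Accepting a bare string alongside the record is one possible way to keep existing callers working while metadata is phased in; whether that is desirable is part of the open design question.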

Links to the positions of each side in the dispute (e.g., specific github comments):

webidl#1025

What steps have already been taken to come to an agreement:

We don't actually disagree with WebIDL. Some working groups, such as WebAuthn and Web Payments, have pushed back on our comments asking for direction metadata because of the lack of a standardized representation on the Web.

We'd prefer the TAG provide feedback as (please select one):

For our own housekeeping: [I18N-ACTION-1103]

Please preview the issue and check that the links work before submitting. In particular, if anything links to a URL which requires authentication (e.g. Google document), please make sure anyone with the link can access the document.

¹ For background, see our explanation of how to write a good explainer.

domenic commented 2 years ago

(The links to https://github.com/whatwg/webidl/issues/1025 are broken in the OP.)

For the TAG's consideration, I think the discussion here is a bit deeper than the procedural question of where a specific dictionary should live. The Web IDL thread brings up an important question which seems more foundational to me:

Is it appropriate for specifications to aspirationally include direction/language metadata in user-facing string APIs, even if no implementer has any plans to use that metadata? See discussion starting https://github.com/whatwg/webidl/issues/1025#issuecomment-934150460 .

I don't have strong feelings about the specific shape of localized-string APIs, whether based on ECMAScript, a shared dictionary defined somewhere, or spec-specific dictionaries. But I do have strong feelings that adding aspirational features without any implementation commitment is not good, even if the aspiration is in the direction of a good cause like proper i18n support.


I've also found it persistently frustrating how the i18n folks have been unable to supply many concrete examples of the APIs they'd like to use this infrastructure with. In particular, I think the category is: JavaScript APIs which accept strings for presentation to the user (not the developer), and do not involve HTML markup at all. We intentionally try to avoid such APIs on the web platform, preferring markup for user-presentation, but sometimes it's unavoidable as we need to present strings in "browser chrome" or similar.

My reading of the thread has been that we eventually settled on there being one or maybe two such APIs on the web platform (PaymentRequest and WebAuthentication), plus Notifications which already has its own solution. The thread also had lots of confusion about geolocation and developer-facing error message localization.

I would love to see the explainer include example code (as the "good explainers" document and the template in the OP suggest) for each of these specific APIs, before/after the proposal.

plehegar commented 2 years ago

FYI, the i18n folks are tracking several direction and language metadata issues.

aphillips commented 2 years ago

I18N has been requesting metadata for some time and there are different dispositions depending on the specification and its needs. There are 18 specifications in the list below. Separately I am reviewing specs that preceded our efforts in this area as well as the list of potential new specifications.

First, JSON-LD added features to the specification allowing document and item level metadata.
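To make the two levels concrete, here is a small TypeScript sketch built on the JSON-LD 1.1 keywords `@language` and `@direction`, which can appear in the context (document-level default) and on an individual value object (item-level). The `title` term, its IRI, and the `effectiveMetadata` helper are assumptions for illustration, and the resolution logic is a deliberate simplification of real JSON-LD processing.

```typescript
type Direction = "ltr" | "rtl";

interface Defaults {
  "@language"?: string;
  "@direction"?: Direction;
}

interface ValueObject {
  "@value": string;
  "@language"?: string;
  "@direction"?: Direction;
}

type Item = string | ValueObject;

// Document-level defaults, as they would appear in "@context".
const defaults: Defaults = { "@language": "en", "@direction": "ltr" };

const items: Item[] = [
  "Design reviews", // plain string: picks up the document-level defaults
  { "@value": "مراجعات التصميم", "@language": "ar", "@direction": "rtl" }, // item-level
];

const doc = {
  "@context": { ...defaults, title: "http://example.org/title" }, // hypothetical term
  title: items,
};

// Simplified resolution: plain strings take the context defaults, while value
// objects carry only their own metadata. A real processor handles many more
// cases (null resets, term-scoped defaults, and so on).
function effectiveMetadata(item: Item, ctx: Defaults) {
  if (typeof item === "string") {
    return { text: item, lang: ctx["@language"], dir: ctx["@direction"] };
  }
  return { text: item["@value"], lang: item["@language"], dir: item["@direction"] };
}

const resolved = doc.title.map((item) => effectiveMetadata(item, doc["@context"]));
```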

The following specifications either added support using JSON-LD or are in the process of doing so:

Some specifications added locally defined metadata (i.e., their own language and direction fields in natural language values, similar to the Localizable type described elsewhere in this request):

One specification (WebAuthn) defined its own serialization scheme; we are still working with them on the details of that scheme.

The following specs are in one way or another waiting on this discussion or have proceeded:

SHACL probably should have adopted the JSON-LD approach. Micropub entangles metadata with the question of localizable error messages and so might not apply.

domenic commented 2 years ago

Thanks for assembling the list. Just to reemphasize:

I would love to see the explainer include example code (as the "good explainers" document and the template in the OP suggest) for each of these specific APIs, before/after the proposal.

aphillips commented 2 years ago

@domenic thanks: I'm also working on addressing that part of your comment.

alvestrand commented 2 years ago

FWIW, WebRTC (device labels) is also in the group of "having pushed back on the request because of no developer interest and no clear pattern to follow". https://github.com/w3c/mediacapture-main/issues/665

hadleybeeman commented 2 years ago

Thanks for this question, @aphillips! We've discussed it in our W3C TAG breakout today.

We think that it's important that the approach be guided by ergonomics for developers, who will be the primary users of this. It's important to note that the data and metadata are closely bound, and the shape of this should reflect that. And, ideally, we think that it should follow from TC39 consensus on what the right approach is, with WebIDL then adjusted accordingly.

Is there anything we can do to help with this, or do you have what you need to crack on? Let us know.

hadleybeeman commented 2 years ago

Hi, we're just revisiting this. @xfq @r12a, I know that @ylafon has spoken to you about this over the past week. If you or @aphillips don't have any final comments in the next week, we are minded to close this. Let us know if we can do anything else to help.

aphillips commented 2 years ago

@hadleybeeman Thanks for the update. We discussed this in our teleconference yesterday (2022-07-14).

Unsurprisingly our next step is to further engage ECMA-402 (the I18N part of TC39) and together engage TC39 about encoding natural language strings with appropriate metadata. One of our goals in opening this issue was to get support (where appropriate) from TAG--probably in the form of "hey, we think this issue is worth paying attention to". Is it reasonable to expect such support? It's hard to tell what TAG's position is from the comments.

Also, does TAG have any recommendations for who to approach at TC39? We can just use the I18N folks, but perhaps a more direct engagement would be better. What can you suggest?

cynthia commented 2 years ago

@littledan is this something you could potentially help with?

hadleybeeman commented 2 years ago

To clarify, we the TAG do think this issue is worth paying attention to and hope it gets the focus it needs.

hadleybeeman commented 2 years ago

We are closing this, since it seems resolved. Please do leave a comment or send us an email when you've established contact with TC39. We are hopeful that the intro to @littledan (above) will help.

alvestrand commented 2 years ago

So is the resolution of this issue a recommendation that WebRTC and other WGs that are awaiting guidance should do nothing until ECMA-402 has finished engaging with TC39 to find a language-appropriate solution?

aphillips commented 2 years ago

Per an action item, updating this issue.

I had a meeting with ECMA-402 (the I18N subcommittee of TC39) on 2022-08-11 and we plan to have a follow up at their next call. In addition, I am reaching out to TC39 in order to get the ball rolling there as well. Generally speaking the 402 folks are supportive, but needed some time to digest our proposal.

@alvestrand RTC and other groups can make cautious progress: in some cases guidance in String-Meta can be followed now. However, in the main, the folks who were waiting before are still waiting. I hope to have at least initial progress to report with TC39 before TPAC, and I will add links here for those who need to follow along or who wish to engage in those conversations as they develop.

domenic commented 2 years ago

How is progress on https://github.com/w3ctag/design-reviews/issues/716#issuecomment-1079145578 ?

littledan commented 1 year ago

The motivation for associating a string with a language seems clear, but what's less clear to me is which APIs should be modified to accept (or produce) such a value. So, I'm looking forward to the answer to the question @domenic asked.

In general, if we have use cases in JavaScript that would benefit from this feature (whether through direct use or as input to another JS built-in method), I'm not opposed to adding it as a built-in class in ECMA-402.

I get the feeling that the motivation for putting the string-with-metadata in WebIDL has to do with ensuring that it's maximally available for use in other specifications. Would this not also be met by putting this dictionary definition in a third specification, and making normative references to it?

(Apologies for my delay in responding to this thread. Addison and I are in touch by email and I hope to have a call with him soon.)