Closed awoie closed 1 year ago
Requested review from Internationalization Working Group here https://github.com/w3c/i18n-request/issues/212
Seems blocked pending a completion of review, and all issues related to review have been filed.
@aphillips is your review complete? Have we addressed all your concerns?
@OR13
This issue is your self-review (thank you for providing it). The I18N review is tracked in our review radar and open issues can be found in the horizontal review board:
As I write this, VCDM has 5 open issues (including this issue). If you feel you have completed your self-review, you can close this issue. You may close other issues if you feel they have been addressed. I18N evaluates our "mirrored" issues periodically and our closing of those issues is what allows you to transition without questions being raised.
Looking at the open issue list, I have just marked w3c/i18n-activity#650 for "close", as you have addressed it. I think we are still awaiting some reply to my most recent comments on the language example thread and the remainder of the issues (other than this self-review) are related to that. Our WG has included the topic of `@context` in our agenda for our weekly teleconference tomorrow (2023-08-31): we may or may not be satisfied as a result of that conversation. At the moment, I am not satisfied, since I think that `name` and `description` without language and direction metadata wasn't our goal, but don't hold me to that pending our internal discussion!!
The issue was discussed in a meeting on 2023-08-30
@msporny,
You said:
Manu Sporny: yes, we had that and people complained, so we removed it. … so maybe we could add one that has the bare `@language` or `@value` in there.
Would you please kindly let me know where I can find these complaints?
@shigeya wrote:
Would you please kindly let me know where I can find these complaints?
It was spread across various WG calls and it was years/months ago, I can't remember which calls exactly. The gist of it was that people didn't like doing this:
"property": {
  "@value": "The value",
  "@language": "en-US"
}
and while we suggested that people do this instead, and set their contexts up to do so:
"property": {
  "value": "The value",
  "lang": "en-US"
}
Which is what we do in the current spec, people are skeptical that developers are going to do this too... mostly because VCs to date demonstrate that people are not doing /any/ i18n encoding.
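As a minimal sketch of the aliasing described above, assume a context that maps the plain keys `value` and `lang` onto the JSON-LD keywords `@value` and `@language` (keyword aliasing is a JSON-LD 1.1 feature). The `ALIAS_CONTEXT` name and the `expand_aliases` helper are hypothetical and only stand in for what a real JSON-LD processor would do:

```python
# Hypothetical context fragment: plain keys aliased to JSON-LD keywords.
ALIAS_CONTEXT = {"value": "@value", "lang": "@language"}


def expand_aliases(node, context):
    """Rewrite aliased keys back to JSON-LD keywords (not a full JSON-LD expansion)."""
    if isinstance(node, dict):
        return {context.get(k, k): expand_aliases(v, context) for k, v in node.items()}
    if isinstance(node, list):
        return [expand_aliases(item, context) for item in node]
    return node


developer_friendly = {"property": {"value": "The value", "lang": "en-US"}}
expanded = expand_aliases(developer_friendly, ALIAS_CONTEXT)
print(expanded)
```

The point of the sketch is that both shapes carry the same language metadata; only the key names differ.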
We could do `@language` in the context, but that won't work for the people that don't want to process the `@context` beyond just checking a few values. We /could/ say that people doing processing MUST pay attention to `@language`, but then most programming environments don't have support for i18n (think JSON-only processing environments). The other problem with `@language` in the `@context` is that it could accidentally set the language for strings that are not meant to have a language attached to them (like all the JWT properties).
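To illustrate the concern about a context-wide `@language`, here is a deliberately simplified sketch. It is not the JSON-LD algorithm: a real processor only tags string values of defined terms, and skips terms whose definition sets `@type` or `"@language": null`. The `apply_default_language` helper and the sample document are hypothetical.

```python
def apply_default_language(node, default_lang):
    """Naively tag every plain string with the default language."""
    if isinstance(node, str):
        return {"@value": node, "@language": default_lang}
    if isinstance(node, dict):
        return {k: apply_default_language(v, default_lang)
                for k, v in node.items() if k != "@context"}
    if isinstance(node, list):
        return [apply_default_language(item, default_lang) for item in node]
    return node


doc = {
    "@context": {"@language": "en"},
    "name": "Example University Degree",  # natural language: tagging is wanted
    "alg": "ES256",                       # identifier: tagging is not wanted
}
tagged = apply_default_language(doc, doc["@context"]["@language"])
print(tagged["alg"])
```

In this sketch the `alg` identifier ends up labelled as English text, which is the kind of accidental tagging that careful context design has to prevent.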
Another option is to introduce a `language` property to the VC that expresses that "all human-readable strings SHOULD default to this language".
I was hoping that you, @shigeya, would propose your translation strings file concept for inclusion into VC v2.0 using `.po` files or similar. However, since that hasn't been done, and since we're transitioning into CR very soon now, it's probably too late to add it at this point w/o the WG buying into the concept/idea.
IOW, we can support i18n as long as implementers are careful w/ the creation of JSON-LD Context files... but that requires effort that some implementers are not willing to make.
There are probably other options... we are 8-ish months away from Proposed Recommendation... we need to either go with the solution we have now and see how the market uses it, or try a different option. Having multiple options is going to harm interop... looking for i18n for guidance on this topic.
Thank you very much for providing the concise summary.
Another option is to introduce a language property to the VC that expresses that "all human-readable strings SHOULD default to this language".
I think that's a reasonable compromise.
If we introduce that option, the `language` property needs a default value (`en-US`?). Also, we need the "dir" property with a default. Then, developers not concerned about language may forget about the property, and some developers who want to specify the language may use the property.
we need to either go with the solution we have now and see how the market uses it
(With my W3C hat down) I would think this is what we should do. This means that, in many cases, the language will be undefined; in my understanding, this is also the case with HTML. I feel uneasy reinventing a new mechanism with defaults.
Which is what we do in the current spec, people are skeptical that developers are going to do this too... mostly because VCs to date demonstrate that people are not doing /any/ i18n encoding.
What information do we really have on this? Are we relying on implementations dominated by US (or US and UK) implementers? I would be surprised if European implementers were just as insensitive to language issues.
I feel uneasy reinventing a new mechanism with defaults.
The problem is that what we are discussing here is the integrity of data.
IMO, data integrity will be difficult to achieve without carefully designed defaults.
If we introduce that option, the language property needs a default value (en-US ?).
Do not default to US English. If the language is not known, make it unknown. There is a tag for that (`und`) and a corresponding locale in CLDR (the "ROOT" locale).
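A minimal sketch of that fallback, assuming a hypothetical credential-level `language` property as floated earlier in this thread (it is not part of the current spec) and the `value`/`lang` string shape shown above. The `resolve_language` helper is illustrative only:

```python
UNDETERMINED = "und"  # BCP 47 tag for undetermined language


def resolve_language(string_value, credential):
    """Pick the most specific language tag available, else 'und'."""
    if isinstance(string_value, dict) and string_value.get("lang"):
        return string_value["lang"]      # per-string metadata wins
    if credential.get("language"):
        return credential["language"]    # hypothetical resource-wide default
    return UNDETERMINED                  # never assume en-US


credential = {"name": "学位証明書", "description": {"value": "Degree", "lang": "en"}}
print(resolve_language(credential["name"], credential))         # und
print(resolve_language(credential["description"], credential))  # en
```

A consumer that receives `und` knows the language was never declared, rather than being told, possibly wrongly, that the text is US English.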
Another option is to introduce a language property to the VC that expresses that "all human-readable strings SHOULD default to this language".
I think that's a reasonable compromise.
I tend to agree.
I would commend this group back to our document STRING-META. Pay particular attention to the discussion of syntactic content and the best practices in resource wide defaults.
Thought: isn't this what `@language` in `@context` already does? "Normal" JSON processors still see these values. They just don't apply JSON-LD's processing rules.
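A sketch of that observation: a JSON-only consumer can look for `@language` (and `@direction`, which JSON-LD 1.1 also allows in a context) without running any JSON-LD processing. The `context_defaults` helper is hypothetical, and it deliberately ignores remote contexts referenced by URL and term-scoped overrides, both of which a real JSON-LD processor would resolve.

```python
import json


def context_defaults(credential):
    """Return (language, direction) declared inline in @context, or None for each."""
    contexts = credential.get("@context", [])
    if not isinstance(contexts, list):
        contexts = [contexts]
    language = direction = None
    for entry in contexts:
        if isinstance(entry, dict):      # URL strings are skipped, not fetched
            language = entry.get("@language", language)
            direction = entry.get("@direction", direction)
    return language, direction


raw = """{
  "@context": ["https://www.w3.org/ns/credentials/v2",
               {"@language": "ar", "@direction": "rtl"}],
  "name": "شهادة"
}"""
print(context_defaults(json.loads(raw)))
```

This only recovers resource-wide defaults that are written inline; whether that is enough depends on how issuers author their contexts.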
We could do @language in the context, but that won't work for the people that don't want to process the @context beyond just checking a few values.
Perhaps this could be solved by adopting a different definition of what "working" means? I think I18N would be satisfied if it were possible for a consumer of a credential to recover the language and direction of natural language strings if they desired to do so, but not to require that every implementation do the recovery every time. It is easier to lobby for adoption of proper internationalization when the data is present.
I wouldn't oppose creating a new mechanism, if that will be more effective. I agree that the "pure JSON" case is real and understand why you desire not to require full JSON-LD. But I am leery of having lots of different standards using incompatible mechanisms to solve the same problem.
"Seeing how the market uses it" is not really a persuasive argument to me. Many implementers will be lazy about adopting features, particularly if they don't directly affect the producer. This is like arguing 35 years ago that "I don't need Unicode for my English because I'll never use all those extra bytes" 😸.
The issue was discussed in a meeting on 2023-09-14
@aphillips would you and other Internationalization WG members be able to join the VC WG to discuss this issue (and also its relationship with Issue #1264) during the VC WG's special topic call next week on Tuesday? Call details are here: https://www.w3.org/events/meetings/f6342df0-f7b5-4fc9-babd-61e55dc5fc2f/20231003T110000/
I would be glad to attend. Please add me to the invite. I will ask others in the WG in our teleconference on 2023-09-28.
The issue was discussed in a meeting on 2023-09-26
I would be glad to attend. Please add me to the invite. I will ask others in the WG in our teleconference on 2023-09-28.
done! is tomorrow still good for others from your WG to attend?
@Sakurann Yes, we're good.
The issue was discussed in a meeting on 2023-10-03
Before this issue is closed can you point to the PR or Issue that is dealing with the default I18N for a VC, as I could not see it
The issue was discussed in a meeting on 2023-10-18
Noting this comment in your meeting minutes:
Manu Sporny: I don't recall when we can close horizontal review issues.
Your working group may close horizontal issues when you feel you have addressed them--they are "just issues" and don't require special treatment. Please do not remove the `*-needs-resolution` or `*-tracker` labels (these are consumed by tools). Your closing an issue is a signal to the horizontal group that you think you're done with it.
Note that we have tracking issues in our own repository (in I18N's case this is here) and you can view the status of your horizontal review in this tracker (filter it by your spec(s)). I18N needs to close any of our matching issues before your transreq will be approved.
Before this issue is closed can you point to the PR or Issue that is dealing with the default I18N for a VC, as I could not see it
It's being tracked in Issue #1264, specifically, this comment contains all the ways we're considering addressing the issues raised by i18n WG's review:
https://github.com/w3c/vc-data-model/issues/1264#issuecomment-1712807665
(edited to change PR #1264 to Issue #1264)
Reopening for proper minute processing
The issue was discussed in a meeting on 2023-11-01
The following is an Internationalization Review for "W3C Verifiable Credentials Data Model v2.0" (VCDM 2.0). The latest published version of the specification can be found here.
The specification is a JSON-LD data model specification for Verifiable Credentials and Verifiable Presentations.
All features can be internationalized using JSON-LD features. Specifically, the validity period for a Verifiable Credential is expressed using XML Schema 1.1 where the dates can be localized and made accessible given the nature of XML Schema 1.1 date time values.
The specification contains an Internationalization Considerations section that provides more details on how internationalization is achieved.
The following is a review based on the short i18n review checklist from here:
If the spec or its implementation contains any natural language text that will be read by a human (this includes error messages or other UI text, JSON strings, etc, etc), then ensure that there’s metadata about and support for basic things such as language and text direction. Also check the detailed guidance for Language and Text direction.
The specification is a JSON-LD data model specification and can use all internationalization features that JSON-LD offers. The Internationalization Considerations section specifically points out how to support different languages, text direction and so on.
If the spec or its implementation allows content authors to produce typographically appealing text, either in its own right, or in association with graphics, then take into account the different typographic styles used around the world (for things such as line-breaking, text justification, emphasis or other text decorations, text selection and units, etc.). Also check the detailed guidance for Typographic support.
If the spec or its implementation allows the user to point into text, creates text fragments, concatenates text, allows the user to select or step through text (using a cursor or other methods), etc., then make allowances for the ways different scripts handle units of text. Also check the detailed guidance on Text-processing.
If the spec or its implementation allows searching or matching of text, including syntax and identifiers, then understand the implications of normalisation, case folding, etc. Also check the detailed guidance on Text-processing.
If the spec or its implementation sorts text, then ensure that it does so in locally relevant ways. Also check the detailed guidance on Text-processing.
If the spec or its implementation captures user input, then ensure that it also captures metadata about language and text direction, and that it accommodates locale-specific input methods.
If the spec or its implementation deals with time in any way that will be read by humans and/or crosses time zone boundaries, then ensure that it will represent time as expected in locales around the world, and manage the relationship between local and global/absolute time. Also check out guidance on Local dates, times and formats.
Dates and times are used when expressing the validity periods for Verifiable Credentials. For these fields, we use XML Schema 1.1 date-time format, see `validFrom` and `validUntil`.
If the spec or its implementation allows any character encoding other than UTF-8, then make sure you have a convincing argument as to why, and then ensure that the character encoding model is correct. Also check out detailed guidance on Characters.
The VCDM uses UTF-8 for all text encoding. The VCDM vocabulary is hosted by W3C and uses the UTF-8 encoding for its contents (default encoding on the web).
If the spec or its implementation defines markup, then ensure support for internationalisation features and avoid putting human-readable text in attribute values or plain-text elements. Also check out detailed guidance on Markup & syntax.
If the spec or its implementation deals with names, addresses, time & date formats, etc, then ensure that the model is flexible enough to cope with wide variations in format, levels of data, etc. Also check out detailed guidance on Local dates, times and formats.
Dates and times are used when expressing the validity periods for Verifiable Credentials. For these fields, we use XML Schema 1.1 date-time format, see `validFrom` and `validUntil`.
Advanced features of the VCDM that define extension points such as `TermsOfUse` can have internationalization considerations but this is out of scope of this specification. It is expected that developers that define concrete extension points or extend the VCDM using the JSON-LD extension mechanism would write and implement their own internationalization considerations.
However, since the specification is based on JSON-LD, all features of the VCDM as well as concrete extensions (even if defined outside of the specification) can be internationalized with JSON-LD using the mechanisms described in the Internationalization Considerations section.
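As a sketch of how a consumer might present a `validFrom` value to a human, assume the value is an XML Schema `dateTime` string that carries `Z` or a numeric offset. The sample timestamp, the `parse_xsd_datetime` helper, and the fixed +09:00 display offset are illustrative assumptions; a real application would use the viewer's time zone and a locale-aware formatter.

```python
from datetime import datetime, timedelta, timezone


def parse_xsd_datetime(text):
    """Parse an xsd:dateTime that carries 'Z' or a numeric offset."""
    # Older Python versions of fromisoformat() reject a trailing 'Z'.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


valid_from = parse_xsd_datetime("2010-01-01T19:23:24Z")
viewer_zone = timezone(timedelta(hours=9))  # assumed display offset
print(valid_from.astimezone(viewer_zone).isoformat())  # 2010-01-02T04:23:24+09:00
```

The stored value stays an absolute instant; only the presentation changes, and here the calendar date shown to the viewer differs from the one in the credential.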
If the spec or its implementation describes a format or data that is likely to need localisation, then ensure that there’s an approach in place which allows effective storage and labelling of, and access to localised alternatives for strings, text, images, etc.
Since the specification is based on JSON-LD, all features of the VCDM as well as concrete extensions (even if defined outside of the specification) can be internationalized with JSON-LD using the mechanisms described in the Internationalization Considerations section.
If the spec or its implementation makes any reference to or relies on any cultural norms, then ensure that it can be adapted to suit different cultural norms around the world (ranging from depictions of people or gestures, to expectations about gender roles, to approaches to work and life, etc).