w3c / web-annotation

Web Annotation Working Group repository, see README for links to specs
https://w3c.github.io/web-annotation/

ed sugg: textDirection #348

Closed r12a closed 7 years ago

r12a commented 8 years ago

https://www.w3.org/TR/annotation-model/#model-1

textDirection definition

The direction of the text of the resource.

-> The overall base direction of the text of the resource.

auto definition

The direction that indicates the value of the resource is explicitly directionally isolated text, and the direction is to be programmatically determined using the value.

-> The direction that indicates the value of the resource is explicitly directionally isolated text, and the direction is to be programmatically determined using the value of the first strong character.

matial commented 8 years ago

On 16 Aug 2016 r12a proposed the following change in the definition of "auto":

auto definition

The direction that indicates the value of the resource is explicitly directionally isolated text, and the direction is to be programmatically determined using the value.

->

The direction that indicates the value of the resource is explicitly directionally isolated text, and the direction is to be programmatically determined using the value of the first strong character.

I had some difficulty analyzing the parts of those sentences. I suggest what seems to me a clearer phrasing.

-> The property value indicating that the value of the resource is directionally isolated text, and that its direction is to be programmatically determined using the value of its first strong character.

I have removed the word "explicitly" since there is nothing explicit here except the specification of "auto" which is our subject, so "explicitly" does not add information, IMHO.

lkemmel commented 8 years ago

I suggest a minor change to Mati's wording:

The property value indicating that the value of the resource is directionally isolated text, and that its direction is to be programmatically determined using the value of its first strong character.

-> The property value indicating that the value of the resource is directionally isolated text, and that its direction is to be programmatically determined using the directional property of its first strong character.
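To make the "first strong character" wording concrete, here is a minimal Python sketch of that heuristic. It is an illustration only, not text from the spec: the function name `first_strong_direction` and its `fallback` parameter are hypothetical, and it assumes the value is available as a Unicode string. It relies on the standard library's `unicodedata.bidirectional`, which returns the bidirectional class of a character.

```python
import unicodedata

# Bidirectional classes that Unicode treats as "strong":
# L (left-to-right), R (right-to-left), AL (Arabic letter, right-to-left).
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def first_strong_direction(text, fallback="ltr"):
    """Return 'ltr' or 'rtl' based on the first strong character in text.

    `fallback` is a hypothetical parameter used when the text contains
    no strong character at all (for example, digits only).
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_LTR:
            return "ltr"
        if bidi_class in STRONG_RTL:
            return "rtl"
    return fallback


if __name__ == "__main__":
    print(first_strong_direction("hello"))                     # ltr
    print(first_strong_direction("\u05e9\u05dc\u05d5\u05dd"))  # rtl (Hebrew letters)
    print(first_strong_direction("123 \u0645\u0631"))          # rtl (digits and spaces are not strong)
```

Digits, spaces and punctuation are skipped because their bidirectional classes are weak or neutral, which is why the wording speaks of the directional property of the character rather than its value.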

iherman commented 8 years ago

@r12a, is this a proposal to close issue #335 through these changes?

r12a commented 8 years ago

No, it's just editorial clarifications to the current text.

tkanai commented 8 years ago

@r12a I'm afraid that the term "strong character" is not crystal clear to everyone, at least not to me, and I think it would be helpful if the text referred to Unicode TR 9. What do you think?

BigBlueHat commented 8 years ago

@r12a maybe we can just crib text from here? https://www.w3.org/TR/appmanifest/#dir-member

r12a commented 8 years ago

@tkanai i'm fine with referring to TR9. And i think you're right about 'strong character', perhaps we should say 'strong directional character', which is how Unicode Standard refers to them.

gsergiu commented 8 years ago

@r12a I would suggest further improvements to the text.

  1. Text direction: The base direction of the text in the resource. (Note ... base implies the meaning ground/general, while overall would suggest the "only" text direction. I think there are also resources which might mix texts, like bilingual forms including fields with English + Arabic names)
  2. Auto:

The property value indicating that the value of the resource is directionally isolated text, and that its direction is to be programmatically determined using the directional property of its first strong character.

The keyword indicating that the text of the resource is directionally isolated, and that its direction is to be programmatically determined according to the Unicode Standard: http://unicode.org/reports/tr9/

lkemmel commented 8 years ago

@gsergiu :

The keyword indicating that the text of the resource is directionally isolated, and that its direction is to be programmatically determined according to the Unicode Standard: http://unicode.org/reports/tr9/

I think this is not informative enough. Maybe: "... according to the Unicode TR9, rules http://unicode.org/reports/tr9/#P2 and http://unicode.org/reports/tr9/#P3".
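For readers unfamiliar with the two rules cited here: P2 looks for the first character of bidirectional class L, AL or R in a paragraph, skipping anything between an isolate initiator and its matching PDI, and P3 sets the paragraph embedding level to 1 if that character is AL or R, and to 0 otherwise. The sketch below is one possible reading of those rules in Python, not normative text. The function name `paragraph_level` is hypothetical, and it assumes the input is a single paragraph held as a Unicode string.

```python
import unicodedata

# Bidirectional classes of the isolate initiators (LRI, RLI, FSI).
ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}


def paragraph_level(paragraph):
    """Return 0 (left-to-right) or 1 (right-to-left) for one paragraph.

    Assumes `paragraph` contains no paragraph separators.
    """
    depth = 0  # number of currently open isolates
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ISOLATE_INITIATORS:
            depth += 1
        elif bidi_class == "PDI":
            # A PDI with no open isolate matches nothing and is ignored.
            if depth > 0:
                depth -= 1
        elif depth == 0 and bidi_class in ("L", "R", "AL"):
            # P3: a right-to-left strong character gives level 1.
            return 1 if bidi_class in ("R", "AL") else 0
    # No strong character found outside isolates: P3 gives level 0.
    return 0
```

An isolate that is never closed leaves `depth` above zero until the end of the string, so everything after it is skipped, which corresponds to the "or the end of the paragraph" clause in P2.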

gsergiu commented 8 years ago

@lkemmel I welcome further refinements (at least on my update proposals), especially because I'm not really an expert in this particular matter.

However, I have to point something out. The enhancement proposed in this ticket seems to be aligned with the HTML5 dir=auto attribute value, which is a somewhat broader definition, given that HTML supports more charsets than JSON.
https://www.w3.org/International/questions/qa-html-dir#dirauto

Given this, I would say, that we don't really need to define how the auto direction is determined, we just need to reference the Unicode specifications that already describe it. (In this way we also avoid the need to explain what 'strong directional character' means)

lkemmel commented 8 years ago

Text direction: The base direction of the text in the resource. (Note ... base implies the meaning ground/general, while overall would suggest the "only" text direction. I think there are also resources which might mix texts, like bilingual forms including fields with English + Arabic names)

I agree that "overall" may not be required here, although each paragraph (as opposed to the characters forming its content) really has the "only" direction. Base direction stands for the paragraph direction/level: http://unicode.org/reports/tr9/#The_Paragraph_Level - and applies to either unidirectional or bidirectional content.

gsergiu commented 8 years ago

@lkemmel Well, that is in Unicode; we are talking here about the base/default text direction to be used mainly for external resources. Having said this, I realize that my previous statement is not correct!

Given this, I would say, that we don't really need to define how the auto direction is determined, we just need to reference the Unicode specifications that already describe it.

In fact, textDirection doesn't refer to Unicode texts only. However, it doesn't refer to HTML code only, either. (The resources might be any text document or multimedia content; for the latter, however, the text direction might be irrelevant.)

But what do we do with application/* content types? Should we recommend using a different value than auto? Or is this property relevant exclusively for text/* content types?

lkemmel commented 8 years ago

@gsergiu The UBA also applies to rich Unicode text or plain/rich non-Unicode text (with some limitations that stem from insufficient coverage, e.g. inability to support directional embeddings, overrides or isolates).

Regarding media types other than "text", in my humble opinion, 'textDirection' may always appear in ANNOTATIONS, but would apply to "TARGETS" with certain media types only (probably still not limited just to "text").

Should we recommend to use a different value than the auto?

I do not see that the spec recommends defaulting to 'auto' anywhere. It does state, though: "The notion of text direction is taken explicitly from the HTML5 [html5] dir attribute" (https://www.w3.org/TR/html5/dom.html#the-dir-attribute), and "The Body or Target MAY have exactly 1 textDirection associated with it", where "MAY" (as opposed to "MUST") suggests the default is "null", and the text direction is decided by an external resource exclusively.
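As an illustration of the "MAY have exactly 1 textDirection" constraint, the following Python sketch checks a Body or Target represented as a plain dictionary. It is a sketch under stated assumptions rather than a validator taken from the spec: the function name `declared_text_direction` and the example body are hypothetical, the allowed values are assumed to be "ltr", "rtl" and "auto", and a missing property is reported as `None`, meaning that no direction was stated.

```python
# Assumed set of permitted values for textDirection.
ALLOWED_DIRECTIONS = {"ltr", "rtl", "auto"}


def declared_text_direction(resource):
    """Return the declared direction, or None if the resource states none.

    Raises ValueError if the property is present but is not a single
    permitted string value.
    """
    if "textDirection" not in resource:
        return None  # optional property: nothing is stated about direction
    value = resource["textDirection"]
    if not isinstance(value, str) or value not in ALLOWED_DIRECTIONS:
        raise ValueError(f"unsupported textDirection: {value!r}")
    return value


# Hypothetical embedded body, for illustration only.
example_body = {
    "type": "TextualBody",
    "value": "Example comment text",
    "language": "en",
    "textDirection": "ltr",
}

if __name__ == "__main__":
    print(declared_text_direction(example_body))            # ltr
    print(declared_text_direction({"type": "TextualBody"}))  # None
```

Returning `None` instead of substituting a default keeps "not stated" distinct from "auto", which is the distinction this comment is drawing.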

azaroth42 commented 8 years ago

The default is "unknown" because of RDF's open world assumption. And, with regrets, the definitions should not refer directly to unicode requirements, as the resources may not be in unicode, although that would clearly be a best practice!

"base direction" is fine for the definition of textDirection.

azaroth42 commented 7 years ago

Proposal:

akuckartz commented 7 years ago

If not already contained in the spec: Please add a suggestion/note to use Unicode. Even a SHOULD-requirement might be appropriate.

azaroth42 commented 7 years ago

@akuckartz We can't recommend unicode for resources outside of our control, and we already require (MUST) unicode for embedded strings, as that's a requirement of JSON.

iherman commented 7 years ago

Discussed 2016-10-28: the group conditionally agrees with the proposal, provided @r12a agrees; if so, the issue can be closed (more exactly, turned into an editorial action).

See: http://www.w3.org/2016/10/28-annotation-irc#T15-35-46

r12a commented 7 years ago

commenter satisfied