w3c / epub-specs

Shared workspace for EPUB 3 specifications.

UTR#50 will have impacts on vertical writing in EPUB3 #195

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Unicode Consortium is developing Unicode Technical 
Report #50 Unicode Properties for Vertical Text Layout. 
It is expected to be used by CSS Writing modes of W3C.  

Since the semantics of EPUB vertical writing depends 
on the latest text for CSS Writing modes, EPUB3 will 
automatically use UTR#50.  In my understanding, 
this will introduce some incompatibilities: what 
was undefined will be defined.

When UTR#50 becomes more mature, we might or might not want 
to change EPUB3.

[1] http://www.unicode.org/reports/tr50/tr50-1.html

Original issue reported on code.google.com by eb2m...@gmail.com on 5 Nov 2011 at 11:47

GoogleCodeExporter commented 9 years ago
There are some alternative solutions.

1) Do nothing.

This means that the semantics is defined by the latest version of CSS Writing 
Modes as well as that of 
UTR#50.

Pros: complete compatibility with the latest version of CSS Writing modes and 
UTR#50
Cons: character orientation will change from time to time (poor fidelity)

2) Introduce a mechanism for specifying the version of UTR#50 
    as part of the EPUB publication document

The semantics is frozen as far as character orientation is concerned.

Pros: Intended character orientations are clear and will not change.
Cons: Slight incompatibility with CSS Writing modes 

3) Specify default character orientations in EPUB publication documents

A single default is unlikely to make everybody happy.  First, it is now clear 
that different Japanese users have different opinions and are unlikely to 
change their minds.  Second, different natural languages are likely to require 
different defaults.  Third, different RSs already use slightly different 
defaults.

Pros: Intended character orientations are clear and will not change.
         Convenient defaults can be specified
Cons: Risks of failures to honor specified defaults and thus even poorer 
interoperability
          Slight incompatibility with CSS Writing modes 
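
For option 2, a minimal sketch of how a reading system might read a declared UTR#50 revision from the OPF package document. The property name `x-example:utr50-revision` and its prefix URI are hypothetical; nothing of the kind is defined by EPUB 3. Only the OPF namespace and the `meta property` syntax are real.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical property name, invented for this sketch.
UTR50_PROPERTY = "x-example:utr50-revision"

SAMPLE_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="uid"
         prefix="x-example: http://example.org/utr50#">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <meta property="x-example:utr50-revision">6</meta>
  </metadata>
</package>
"""


def declared_utr50_revision(opf_text):
    """Return the declared UTR#50 revision as a string, or None if absent."""
    root = ET.fromstring(opf_text)
    for meta in root.iter("{%s}meta" % OPF_NS):
        if meta.get("property") == UTR50_PROPERTY:
            value = (meta.text or "").strip()
            return value or None
    return None


print(declared_utr50_revision(SAMPLE_OPF))
```

A reading system that does not recognize the property would simply ignore it, which is where the slight incompatibility with CSS Writing Modes comes from.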

Original comment by eb2m...@gmail.com on 8 Mar 2013 at 6:18

GoogleCodeExporter commented 9 years ago
Here are some further details for the third option, namely default 
character orientations specified in EPUB publication documents.

First, I'm looking for existing non-UTR50 defaults.

One suboption of this third option is to introduce some syntax for specifying 
the default.  An obvious choice is the syntax used in UTR#50 for defining the 
default, but this syntax may change when properties or permissible values are 
changed.

Another suboption is to create a registry of named defaults, with some 
organization acting as the maintainer of the registry.  Some naming conventions 
might allow fallback when RSs do not recognize specified defaults.
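
One possible reading of "naming conventions might allow fallback", as a rough sketch: hyphen-separated names, where an RS drops trailing segments until it finds a name it knows. The registry names and all table entries below are invented for illustration; no such registry exists.

```python
# Hypothetical registry of named defaults. Names and entries are invented
# for illustration only.
REGISTRY = {
    "ja": {0x201C: "upright"},
    "ja-novel": {0x201C: "upright", 0x2014: "upright"},
    "zh-Hant": {0xFF1B: "upright", 0xFF5E: "upright"},
}


def resolve_default(name, registry=REGISTRY):
    """Return (matched_name, table), dropping trailing '-' segments until a
    registered name is found. Returns (None, {}) when nothing matches."""
    parts = name.split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in registry:
            return candidate, registry[candidate]
        parts.pop()
    return None, {}


matched, table = resolve_default("ja-novel-2013")
print(matched)
```

With this convention, an RS that only knows `ja` would still do something reasonable for a publication that asks for `ja-novel-2013`.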

Original comment by eb2m...@gmail.com on 31 Mar 2013 at 6:01

GoogleCodeExporter commented 9 years ago
I think the main purpose of defining the default behavior in vertical layout is 
to reduce unnecessary work. But I strongly feel it depends on the language.
Take the left double quotation mark (U+201C) for example. When it is used in 
Japanese text, it is convenient for content creators if it is rendered 
"upright" in vertical layout, but it is expected to be shown sideways when it 
is used in English text.

Please take a look at W3C specifications. You will see vertical text in English 
at the top left. Vertical layout gives many more capabilities to non-Japanese, 
non-Chinese text. I think UTR #50, which is based on experience with Japanese 
text, might not be acceptable to speakers of other languages, and the default 
behavior should be based on each language.
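
A small sketch of what a language-dependent default could look like, using the U+201C example above. Only the two U+201C entries come from this comment; the table layout, the function name, and the fallback value are assumptions.

```python
# (primary language subtag, code point) -> default orientation in vertical text.
# Only the two U+201C entries come from the comment above.
LANGUAGE_DEFAULTS = {
    ("ja", 0x201C): "upright",
    ("en", 0x201C): "sideways",
}


def default_orientation(char, lang, fallback="sideways"):
    """Look up the default orientation of one character for a language tag.

    Only the primary subtag is used, so "en-US" is treated as "en".
    The fallback value is an assumption made for this sketch.
    """
    primary = lang.split("-")[0].lower()
    return LANGUAGE_DEFAULTS.get((primary, ord(char)), fallback)


print(default_orientation("\u201c", "ja"))
print(default_orientation("\u201c", "en-US"))
```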

Original comment by tkanai...@gmail.com on 12 Apr 2013 at 11:00

GoogleCodeExporter commented 9 years ago
Prefer #2, ok with #1, disagree with #3.

Original comment by kojii...@gmail.com on 20 Apr 2013 at 3:35

GoogleCodeExporter commented 9 years ago
> disagree with #3.

Could you elaborate the reason?

When will the next draft of UTR#50 be published and CSS Writing Modes
reference it?  Lots of EPUB publications in Japanese have been prepared 
without knowing the final outcome of UTR#50, and more are being 
prepared.  (I am wondering if "everything is implementation dependent" 
is the best solution because it does not invalidate existing publications and 
RSs, 
which might not be changed no matter what we decide.)

By the way, here is a Japanese comment on this issue as part of the 
DTS ballot.

-------------------------------------------------------------------------
This issue is tricky and we do not have a good answer.  Hopefully 
further progress of UTR50 and CSS Writing Modes will make 
things easier.

We request that at least Hiragana, Katakana,
and Kanji characters are guaranteed to be laid
out upright.  We know that this does not
mean much however.

We have considered many options about this
topic.  Font-dependent solutions, an attribute in
OPF for specifying the version of UTR50, a language
for specifying defaults, and a registry of defaults.
However, we have not reached consensus.

Furthermore, we hear that some Taiwanese are concerned
about UTR50.  They believe that it is not very appropriate
for the Taiwanese language.  Other areas such as Hong Kong 
might have different requirements.  If a single default does 
not work, we might need a language-dependent default mechanism.

Original comment by eb2m...@gmail.com on 24 Apr 2013 at 12:32

GoogleCodeExporter commented 9 years ago
> Could you elaborate the reason?

It's an in-page layout feature that must be implemented within the rendering 
engine. Such a feature is hard to implement without Unicode/W3C consensus.

I also see very little value in doing so. If an author wants a different 
default value, anyone can develop a pre-processor to output W3C/Unicode 
conformant HTML/CSS.
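
As a rough illustration of such a pre-processor (a sketch only, not an existing tool): it takes plain text plus the set of characters the author wants upright, and emits HTML in which each of those characters is wrapped in a span carrying the `text-orientation: upright` declaration from CSS Writing Modes. The function name and the choice of inline styles are assumptions.

```python
import html


def wrap_upright(text, upright_chars):
    """Escape text for HTML and wrap each character listed in upright_chars
    in a span that requests upright orientation."""
    out = []
    for ch in text:
        escaped = html.escape(ch)
        if ch in upright_chars:
            out.append(
                '<span style="text-orientation: upright">%s</span>' % escaped
            )
        else:
            out.append(escaped)
    return "".join(out)


# U+FF1B FULLWIDTH SEMICOLON is one of the characters discussed in this issue.
print(wrap_upright("a\uff1bb", {"\uff1b"}))
```

The output stays plain HTML+CSS, so the rendering engine needs no EPUB-specific behavior.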

> When will the next draft of UTR#50 be published and CSS Writing Modes
> reference it?

Editors cannot make such a commitment; it is up to the UTC to resolve issues, 
not the editors. The next UTC meeting is in May. CSS will automatically pick up 
new drafts, as they are published, at its next publication.

> more are being prepared

I didn't understand this, sorry.

> We request that at least Hiragana, Katakana,
> and Kanji characters are guaranteed to be laid
> out upright.

This request is already taken at UTC, except halfwidth Katakana.

> We have considered many options about this
> topic.  Font-dependent solutions, an attribute in
> OPF for specifying the version of UTR50, a language
> for specifying defaults, and a registry of defaults.
> However, we have not reached consensus.

I don't know who "we" are here, but I heard from EBPAJ that they're going to 
stick with draft #6 until UTR#50 reaches Draft status, and once it does, they 
could define a proprietary property to indicate which one a publication 
follows. So in my understanding, at least 40 or so Japanese publishers are in 
consensus to have only two versions of UTR#50, and a property to distinguish 
the two.

The reason I prefer #2 is to make the property non-proprietary.

> Furthermore, we hear that some Taiwanese are concerned
> about UTR50.  They believe that it is not very appropriate
> for the Taiwanese language.  Other areas such as Hong Kong 
> might have different requirements.

They can send their concerns to Unicode, or you're welcome to proxy if you want 
to. "We'll re-define our own because we don't like it" doesn't look like a 
great approach to me.

> If a single default does 
> not work, we might need a language-dependent default mechanism.

A single default does not work for anyone, including Japanese. There's no such 
thing, and UTR#50 does not have "a single default that can lay out all 
documents correctly" in its scope. It merely provides a default for document 
interchange purposes, and higher-level protocols should provide a way to 
specify orientations. CSS provides the text-orientation property for that 
purpose.

It's like defining a default font size. There will never be a single correct 
answer, but if 60% of documents in the world use 8pt, then for document 
interchange purposes, making the default 8pt and providing a way to override it 
is the only thing standardization can do. Japanese and other Han-based scripts 
may always need to override it because of their complex glyphs, but documents 
are still interchangeable.

Original comment by kojii...@gmail.com on 24 Apr 2013 at 6:18

GoogleCodeExporter commented 9 years ago
Thanks for your clarification.

>> more are being prepared
>I didn't understand this, sorry.

More EPUB publications in Japanese (vertical writing) are 
being prepared.

>I don't know who "we" are here, 

The mirror committee for SC34 as well as some members of IEC TC100/TA10.

> I heard from EBPAJ that they're going to stick on draft #6 until UTR#50 goes 
> Draft, and once it goes Draft, they could define a proprietary property to 
> indicate so.  So in my understanding, at least 40 or so Japanese publishers 
> are in consensus to have only two versions of UTR#50, and a property to 
> distinguish the two.

It is nice to hear that they have a position.  But I have a question.  AFAIK, 
there are few (no?) RSs that use some version of UTR#50 without some 
proprietary changes.  How will they handle such proprietary changes?

Original comment by eb2m...@gmail.com on 24 Apr 2013 at 9:36

GoogleCodeExporter commented 9 years ago
>>> more are being prepared
>>I didn't understand this, sorry.
>
> More EPUB publications in Japanese (vertical writing) are 
> being prepared.

I guess you mean being created and published, correct?

> The mirror for the SC34 as well some members of the IEC TC100/TA10.

Are they users, authors, publishers, vendors, or standardization engineers? I 
haven't heard of the possibility of, or a wish for, #3 from anyone, so I'm a 
bit surprised and wonder whom I failed to listen to.

> But I have a question.  AFAIK, there 
> are few (no?) RSs that uses some version of UTR#50 without some proprietary 
> changes.  How will they handle such proprietary changes?

Most of the differences are tiny with regard to use in Japanese. A few may have 
an impact, in places where authors already set the text-orientation property. 
So as long as the Draft changes are within what they expect, almost no action 
is required. If the changes are larger than that, they might want to add a 
proprietary property as mentioned earlier.

Since a proprietary property isn't nice, #2 might be able to help if such 
changes occur.

Original comment by kojii...@gmail.com on 25 Apr 2013 at 1:35

GoogleCodeExporter commented 9 years ago
>> The mirror for the SC34 as well some members of the IEC TC100/TA10.
>
> Are they users, authors, publishers, vendors, or standardization engineers? I 
> haven't heard of possibility or wish of #3 from anyone, so I'm a bit surprised 
> who I missed to listen to.

Kobayashi-san of Antenna House.  He is unhappy with UTR#50.

Original comment by eb2m...@gmail.com on 25 Apr 2013 at 1:46

GoogleCodeExporter commented 9 years ago
I'd like to describe the UTR #50 issue for Traditional Chinese.
Apple's iOS 6.1 implemented UTR #50 in its own WebKit. But UTR #50 doesn't 
cover Chinese usage, which leaves some punctuation in the wrong orientation in 
iBooks.  

Such as:

";FULLWIDTH SEMICOLON U+FF1B" should be upright in vertical writing. Apple 
dev team replied that we can use CSS to fix this issue or provide revision to 
UTR #50.
"~FULLWIDTH TILDE U+FF5E" also should be upright in vertical writing. This 
one can't be upright even with CSS.

Both of them are frequently used in Chinese books. 

Option 2 is OK, but I prefer option 3 as a reference for reading systems 
(especially international ones) to implement. That would be a quick solution to 
the problem we are facing.

Original comment by bobbyt...@wanderer.tw on 25 Apr 2013 at 3:39

GoogleCodeExporter commented 9 years ago
> Kobayashi-san of Antenna House.  He is unhappy with UTR#50.

That is understandable, because what he says is different from the scope of 
UTR#50.

> ";FULLWIDTH SEMICOLON U+FF1B" should be upright in vertical writing.
> Apple dev team replied that we can use CSS to fix this issue or provide
> revision to UTR #50.

I agree with the Apple dev team. The scope of UTR#50 is to provide a default 
value for document interchange, and to recommend that higher-level protocols 
such as CSS override it. There are a lot of cases in Japan too where a specific 
code point requires authors to use the text-orientation property to override. 
The goal of UTR#50 is to make it clear which character appears in which 
orientation by default, so that authors know which characters need the 
text-orientation property to override.
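
To illustrate the point about knowing which characters need an override, here is a small sketch that compares an author's desired orientations against a default table and lists the code points that differ. Both tables are hypothetical sample data and are not taken from the UTR#50 data file; the function name is also an assumption.

```python
def characters_needing_override(text, default_table, desired_table):
    """Return the sorted code points in text whose desired orientation
    differs from the default, i.e. those that need text-orientation."""
    needed = set()
    for ch in text:
        cp = ord(ch)
        desired = desired_table.get(cp)
        if desired is not None and desired != default_table.get(cp):
            needed.add(cp)
    return sorted(needed)


# Hypothetical sample data, NOT taken from the UTR#50 data file.
sample_default = {0x3042: "upright", 0xFF1B: "sideways"}
sample_desired = {0x3042: "upright", 0xFF1B: "upright"}

for cp in characters_needing_override("\u3042\uff1b\uff1b",
                                      sample_default, sample_desired):
    print("U+%04X" % cp)
```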

> "~FULLWIDTH TILDE U+FF5E" also should be upright in vertical writing.
> This one can't be upright even with CSS.

If you can't change the orientation by CSS, it's a bug either in WebKit or in 
the font. You need to talk to your vendor.

It sounds like all these discussions are something to be done at Unicode. What 
we'd like to discuss here is whether we want to ignore Unicode and W3C and make 
our own definition, or follow them. I prefer to follow them, as it's not easy 
to implement if EPUB goes in a different direction from the Open Web Platform.

Original comment by kojii...@gmail.com on 30 Apr 2013 at 1:08

GoogleCodeExporter commented 9 years ago
I do not buy #3 yet, because I am not sure that different defaults would be 
registered and honored.  But this option at least provides a consistent story, 
and I see nothing wrong in creating an EPUB-specific mechanism for declaring 
different defaults.  It is simply impossible for UTR#50 to be almighty for 
every document format, every natural language, or every publication genre.  
Rather, as I see it, UTR#50 now makes some people extremely unhappy.  I also 
think that if perfect equivalence to bare HTML+CSS is required, we should not 
adopt #2.

Original comment by eb2m...@gmail.com on 30 Apr 2013 at 10:33

GoogleCodeExporter commented 9 years ago
My immediate thought is "who can contribute Chinese usage for UTR #50 in 
Taiwan?" I've found a font foundry that can provide information about Chinese 
character orientation in vertical writing.  But as far as I know, no one in the 
Unicode Consortium is from Taiwan. That may take a long time. I will try my 
best.

Original comment by bobbyt...@wanderer.tw on 2 May 2013 at 8:55

GoogleCodeExporter commented 9 years ago
Bobby, UTR#50 is not what you understand it to be. It does not provide a good 
value for anyone, just a single set of orientations against which anyone can 
write differences. Please read the Overview and Scope sections carefully, not 
only the data. You will then find it's not a good orientation for any single 
script.

That said, there is no such thing as UTR#50 for Taiwan. There will be a way for 
Taiwanese users to use CSS Writing Modes Level 3, in which such users will 
always apply the text-orientation property to a certain set of characters. CSS 
may look for a method to make such use easier in CSS Writing Modes Level 4. 
Until then, everyone, including Japanese users, needs to apply the 
text-orientation property to get the desired glyph orientations as needed.

Original comment by kojii...@gmail.com on 3 May 2013 at 7:16

GoogleCodeExporter commented 9 years ago
Re: CSS may seek for a method to make such use easier in CSS Writing Modes 
Level 4

My option #3 is nothing but an attempt to introduce such a method as 
part of EPUB 3.0.1.  Such external mechanisms for declaring defaults 
have historically been used for base URIs (see 5.4.1 in 
http://www.ietf.org/rfc/rfc2396.txt) and the lang attribute (see 
http://www.w3.org/TR/html5/dom.html#the-lang-and-xml:lang-attributes).
It might be a good idea for CSS WM to allow such external defaults.

I still have mixed feelings about #3 and feel at a loss.  But I have 
come to dislike #2, since it does not provide convenient defaults 
and it does not provide complete equivalence with CSS WM.  

Original comment by eb2m...@gmail.com on 3 May 2013 at 11:04

GoogleCodeExporter commented 9 years ago
From an implementer's point of view, #3 is a nightmare. If the feature must 
modify the rendering engine, please let W3C do so.

I'm happy for IDPF to define features outside of the page box, but not inside, 
especially when authors can work around the problem by using pre-processors.

Original comment by kojii...@gmail.com on 4 May 2013 at 4:16

GoogleCodeExporter commented 9 years ago
I think that #1 (Do nothing) is probably the best and that we should 
recommend thorough markup and explicit text-orientation to everybody.

Original comment by eb2m...@gmail.com on 4 May 2013 at 4:29

GoogleCodeExporter commented 9 years ago
As resolved on the WG 20130822 call [1], the EPUB 3.0.1 spec will make no 
explicit mention of UTR50 (apart from what's inherited from CSS Writing Modes). 

It was noted on the call that a private extension to OPF metadata may be put in 
use, and if proven useful, may be included in a future revision of EPUB 3. 

[1] https://docs.google.com/document/d/19hgdsyWiGXKc-CUZlOA3PeaVdvR88DEkcIjsCpdXZAE/edit

Original comment by markus.g...@gmail.com on 28 Aug 2013 at 8:51