Closed zcorpan closed 7 years ago
Yeah, falling back to parentheses when there is no rp makes sense to me.
cc @yosinch
PR for spec: https://github.com/whatwg/html/issues/2113 PR for wpt: https://github.com/w3c/web-platform-tests/pull/4259
@jfkthame is there interest to implement this in Gecko? @tkent-google is there interest to implement this in Chromium? @travisleithead is there interest to implement this in Edge?
@upsuper wdyt about this? Would you like to take it for gecko?
One issue is that Gecko implements the ruby model from the CSS Ruby spec, which is more complicated than the one in the current HTML spec. The CSS model supports continuous <rt> elements as well as the <rtc> element, which means the algorithm you proposed in #2113 wouldn't work for Gecko.
[Slightly offtopic: this proposal actually again highlights the defect of HTML's ruby model. This model fails to express words like "振り仮名 → 振り仮名(ふりがな)" in a reasonable way which has desired behavior on both rendering and plain text. HTML spec should really adopt the CSS Ruby model.]
There is also the question of whether the parentheses should be proportional or fullwidth (w3c/csswg-drafts#762). I think for CJK languages, the majority of people would prefer either fullwidth parentheses or proportional parentheses with whitespace around them. Maybe proportional parentheses are more desirable for other languages? Although it seems to me CJK languages (especially Japanese) are the main users of ruby.
Personally I don't like to see the innerText algorithm become increasingly complicated. IIUC, it was specced this way for web compatibility, not really because of its distinct functionality (?). So I don't think it's worth adding anything to it except for compatibility reasons. I may be wrong about this.
Thanks @upsuper.
So with rtc, an rp might be a sibling of the rtc but not a sibling of the rt. The algorithm could be changed to accommodate that, but first we should decide whether to do this at all.
I think fullwidth parentheses should be used if that is commonly used by CJK.
You are correct that innerText was added mainly for better Web compat.
I'm happy to drop the proposal if people think it is not worth it. My question then is, should we also drop the special handling of rp currently in the spec, which is implemented only in Gecko at the moment?
(The ruby model is issue #121.)
So with rtc, an rp might be a sibling of the rtc but not a sibling of the rt. The algorithm could be changed to accommodate that, but first we should decide whether to do this at all.
rtc, rt, and rp can all be siblings of each other. The rule for adding parentheses could be complicated. There is an attempt in the CSS Ruby spec at generating parentheses automatically, but that rule isn't perfect, and it probably doesn't fit well with the descriptive language used for the innerText algorithm.
My question then is, should we also drop the special handling of rp currently in the spec, which is implemented only in Gecko at the moment?
I'm fine with doing this if no one else opposes.
I agree with @upsuper about the last paragraph of https://github.com/whatwg/html/issues/1801#issuecomment-263584759. Introducing new behavior which is not compatible with any existing implementation isn't welcome.
Thanks. I've withdrawn the proposal. I will make a new pull request to drop the special handling of rp.
I think for CJK languages, majority of people would prefer either fullwidth parentheses or proportional parentheses with whitespace around. Maybe proportional parentheses are more desirable for other languages? Although it seems to me CJK languages (especially Japanese) are the main user of ruby.
That’s not quite true. People DO use proportional (half-width) parentheses in Japanese without spaces. I’ve rarely seen anyone inserting spaces around parentheses in Japanese for that matter.
You are correct that innerText was added mainly for better Web compat.
I'm happy to drop the proposal if people think it is not worth it. My question then is, should we also drop the special handling of rp currently in the spec, which is implemented only in Gecko at the moment?
Inserting parentheses is quite important for copy & paste (otherwise important content can be lost during copy). WebKit uses the same algorithm for both innerText and copy & paste so this is quite important for us.
I think for CJK languages, majority of people would prefer either fullwidth parentheses or proportional parentheses with whitespace around. Maybe proportional parentheses are more desirable for other languages? Although it seems to me CJK languages (especially Japanese) are the main user of ruby.
That’s not quite true. People DO use proportional (half-width) parentheses in Japanese without spaces. I’ve rarely seen anyone inserting spaces around parentheses in Japanese for that matter.
I agree, but we have to choose one. I think you'll find the word "typically" if you look at the referring bugs and the discussion at the I18N WG, and I agree with I18N that if we pick the typically used one, that'd be fullwidth.
A larger issue than width is the baseline. ASCII parentheses are usually designed to match the x-height, which is too low for CJK, while fullwidth parentheses are designed to match the em-height. A few fonts have em-height ASCII parentheses, but they really are few (I know of only 3), because doing so sacrifices English rendering.
In today's font environment, if we want parentheses that match CJK without extra spacing, we need to use fullwidth code points with the pwid OpenType feature.
Inserting parenthesis is quite important for copy & paste...
I'll leave it to @tkent-google whether we want to do this or not.
In today's fonts environment, if we want parentheses that matches to CJK without extra spacing, we need to use fullwidth code points with pwid OpenType feature.
The problem here is that the lack of rp would now result in full-width parentheses being inserted even in English and Latin text, which is highly undesirable. Using half-width parentheses, on the other hand, would still work for CJK even if it isn't ideal. We might need to resolve the current language from the nearest ancestor and decide whether to use full-width or not.
I know some people are talking about ruby being useful for Latin and other languages, but I have never seen a single page using it. Have you?
Either way, it looks like Gecko and Blink don't want this. Maybe we should try to reach consensus on it first. Well, it was probably me who added the noise, sorry about that.
https://html.spec.whatwg.org/multipage/dom.html#the-innertext-idl-attribute
The innerText getter has a special case for Text nodes that are children of rp elements; the text is included even though rp is 'display:none' by default.
Demo: http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4488
This is nice but I think it is more common to omit rp and only use rt, and in that case it's not helping.
The rendering section has:
https://html.spec.whatwg.org/multipage/rendering.html#phrasing-content-3
I think if we are going to special case ruby in innerText at all it would be good to make it "nice" also if rp is not being used, like in the rendering section.
Concretely, if a ruby element has no rp children, include "(" before rt children and ")" after.
cc @rniwa @rocallahan @jfkthame
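To make the rule just described concrete, here is a minimal Python sketch. It uses a toy tree model rather than a real DOM, and the `Element` class and the `plain_text` and `ruby_text` functions are hypothetical names introduced only for illustration. It is not the wording of the spec PR, and it deliberately ignores rtc and nested ruby.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element: a tag name plus child nodes."""
    tag: str
    children: List[Union["Element", str]] = field(default_factory=list)


def plain_text(node: Union[Element, str]) -> str:
    """Concatenate all descendant text, ignoring CSS entirely."""
    if isinstance(node, str):
        return node
    return "".join(plain_text(child) for child in node.children)


def ruby_text(ruby: Element) -> str:
    """Flatten a ruby element, adding fallback parentheses when it has no rp child."""
    has_rp = any(isinstance(c, Element) and c.tag == "rp" for c in ruby.children)
    parts = []
    for child in ruby.children:
        text = plain_text(child)
        # Only wrap rt children, and only when the author supplied no rp at all.
        if not has_rp and isinstance(child, Element) and child.tag == "rt":
            text = "(" + text + ")"
        parts.append(text)
    return "".join(parts)
```

For example, `ruby_text(Element("ruby", ["漢字", Element("rt", ["かんじ"])]))` gives `漢字(かんじ)`, the same string the sketch produces when the author writes explicit rp elements containing "(" and ")".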
Implementer interest: