w3c / csswg-drafts

CSS Working Group Editor Drafts
https://drafts.csswg.org/

[css-content] Quote character choice must depend on surrounding language, not language of the quotation #5478

Open r12a opened 3 years ago

r12a commented 3 years ago

2.4.1. Specifying quotes with the quotes property https://drafts.csswg.org/css-content/#quotes-property

auto A typographically appropriate used value for quotes is automatically chosen by the UA based on the content language of the element and/or its parent.

Note: The Unicode Common Locale Data Repository [CLDR] maintains information on typographically appropriate quotation marks. UAs can use other sources of information as well, particularly as typographic preferences can vary; however it is encouraged to submit any improvements to Unicode so that the entire software ecosystem can benefit.

The i18n WG raised an issue related to the Rendering section of the HTML spec because current implementations choose quote marks based on the language of the quotation, rather than the language of the surrounding text. This is wrong, and needs to be fixed.

It is noticeable when the language of the quotation is different from that of the surrounding text. See these tests:

Since the HTML spec is now to have that section removed (due to introduction of the auto value for the quotes property), we need this requirement to be made clear in the CSS spec.

[Note, btw, that contrary to the request at the start of the HTML thread mentioned above (https://github.com/whatwg/html/issues/3636), it is important to base the choice of quotes on the language of the immediately surrounding text, not the language of the html tag (as is made clearer further down the discussion thread).]

It would also help to include a note showing how content authors can make their styling produce the correct results (since it's far from straightforward, and it will no longer be possible to point to examples in the HTML spec).
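As a starting point for such a note, here is a minimal sketch of author styling that keys the quote marks on the language of the quotation's parent rather than on the quotation itself. The two-language list, the particular quote characters, and the use of a Selectors Level 4 complex `:not()` argument are illustrative assumptions, not text from any spec.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Quote marks chosen by the surrounding language (sketch)</title>
<style>
  /* Outermost q: choose marks from the PARENT's language, not the q's own.
     q:not(q q) excludes any q that sits inside another q. */
  :lang(en) > q:not(q q) { quotes: "“" "”" "‘" "’"; }
  :lang(fr) > q:not(q q) { quotes: "«" "»" "‹" "›"; }

  /* Nested q: keep the convention chosen for the outermost quotation. */
  q q { quotes: inherit; }
</style>
</head>
<body>
  <!-- English paragraph, French quotation: the intent is English marks. -->
  <p>But Lucy replied: <q lang="fr">Embrassez George de ma part et dites-lui, <q>Embrouille</q></q></p>
</body>
</html>
```

Because `quotes` is an inherited property, the `q q` rule also works when other inline elements sit between the outer and inner quotation. Browsers that do not support a complex selector inside `:not()` would drop the first two rules, so this is not a universal recipe.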

fantasai commented 3 years ago

A few key comments from that other issue:

It would be easy to say that auto resolves based on the parent content language rather than the element's own, and I think we should definitely at least do that.

The next consideration is whether we want to get the q { quotes: inherit; } behavior required to get consistent quoting conventions through multiple levels of quotation. Given that auto inherits as itself, it would require some other keyword (like the match-parent keyword we have added to text-align) to resolve the computed value to a specific quoting convention.

css-meeting-bot commented 3 years ago

The CSS Working Group just discussed [css-content] Quote character choice must depend on surrounding language, not language of the quotation, and agreed to the following:

RESOLVED: auto value of quote to be based on parent language
RESOLVED: Add match-parent keyword

The full IRC log of that discussion
<dael> Topic: [css-content] Quote character choice must depend on surrounding language, not language of the quotation
<dael> github: https://github.com/w3c/csswg-drafts/issues/5478
<dael> fantasai: The question raised is we have an auto keyword for the quotes property. Richard points out the set you want to use is the set of the parent, not the element's language
<tantek> wow I thought we solved that with the Q element back in the day
<dael> fantasai: Do we want to resolve to define auto value of quote to choose based on parent's content language rather than element's content language?
<dael> Rossen_: Comments?
<fantasai> tantek, this is how the Q element is implemented
<fremy> this seems to make sense
<dael> ??: Works for me and I see Richard's point. Good issue.
<faceless2> s/??/faceless2
<dael> tantek: Agree as well. I thought we did that in E5 Mac for Q element. We should look at Q element
<dael> fantasai: Q element is defined in CSS terms. So we need to fix our spec.
<tantek> s/E5/IE5
<dael> Rossen_: Objections to resolve auto value of quote to be based on parent language
<dael> RESOLVED: auto value of quote to be based on parent language
<dael> fantasai: Second part of issue
<fantasai> https://github.com/whatwg/html/issues/3636#issue-316269336
<fantasai> But Lucy replied: “Embrassez George de ma part et dites-lui, ‘Embrouille’”
<dael> fantasai: If you have quote within a quote you will typically use quotation style of contextual language not immediate parent.
<dael> fantasai: Previously when didn't have auto some discussion on how to do that with selectors and ^ have examples
<dael> fantasai: auto inherits as itself. No way to get that behavior of I am using the quotation marks of my context.
<dael> fantasai: Can get the behavior but need keyword like match-parent. Aline has that to use same resolved alignment as parent
<florian> s/Aline/align/
<dael> fantasai: We added match-parent which says look at my parent value and if it's physical inherit. Logical resolve and inherit
<florian> s/Aline/text-align/
<dael> fantasai: Can do similar here where if my parent value is similar to quote inherit. If it's auto, resolve that to a string and set my computed value to that string and it inherits through.
<Rossen_> q
<dael> florian: Would that be what match-parent does? I think you want to match style but not string.
<AmeliaBR> Quotes already need to keep track of nesting, so can't auto have a behavior that if this is a nested quote, use the same language quotes as the outer level?
<dael> fantasai: Nesting is from open/close quote
<dael> florian: Yes, yes.
<dael> fantasai: Keeping track of nesting we don't have to worry about here. Nesting levels is defined and UA is responsible to choose correct.
<dael> fantasai: Do we want to add a match-parent keyword?
<florian> I think we do
<fantasai> s/define/handled by counters built into the content property/
<dael> Rossen_: I see one support on IRC
<dael> Rossen_: Other thoughts or reasons why we shouldn't?
<dael> jfkthame: Unclear. Is match-parent in UA stylesheet or authors expected to do?
<dael> fantasai: I don't know the answer. If want behavior where using quotation lang of context it's easy in UA stylesheet. That should get correct behavior. If we want that in UA stylesheet I'm less sure but I leave that to i18n and WhatWG. This makes it possible to do
<fantasai> s/that should/q q { quotes: match-parent; } should
<dael> florian: Seems that if we want q element to be able to have the range of behaviors needed to use it properly we should do this. If we believe people don't use q element maybe we can skip, but if we want it to be useful we should do this.
<dael> jfkthame: Then we should do it. I think it's a useful property
<fantasai> s/jfkthame/faceless2/
<dael> Rossen_: hearing people leaning toward the optional keyword match-parent.
<dael> Rossen_: Thoughts of objections to adding match-parent keyword?
<dael> RESOLVED: Add match-parent keyword

r12a commented 3 years ago

Just wondering about the status of this. It doesn't seem that any edits have yet been made, afaict. It seems to me that something along these lines may be useful.

[1]

A typographically appropriate used value for quotes is automatically chosen by the UA based on the content language of the element and/or its parent.

Replace with something like:

A typographically appropriate value for quotes is automatically chosen by the UA based on the content language of the q element's parent. Nested q elements use the same content language for values as the first element.

[2]

Note: If a quotation is in a different language than the surrounding text, it is customary to quote the text with the quote marks of the language of the surrounding text, not the language of the quotation itself.

-->

Note: If a quotation is in a different language than the surrounding text, it is customary to quote the text with the quote marks of the language of the surrounding text, not the language of the quotation itself. Nested quotations continue to use quote marks for the same language as the top-most quotation.

[3]

 Il disait: « Il faut mettre l’action en ‹ fast forward ›. » 

The quote marks around fast forward seem a little ambiguous to me wrt whether they surround a quotation (as opposed to a foreign term). But it also misses the chance to show French quotes on the outside, too. Maybe you have a reason for choosing that example, or maybe it's better to use the following (or both?).

Mais Lucy répondit: «Give George my love and tell him, ‹Muddle›».
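
For reference, the sentence above could be marked up roughly as follows. This is only a sketch: the `lang` values and the nested `q` elements are assumptions about how the sentence would be tagged, not markup taken from the spec.

```html
<!DOCTYPE html>
<html lang="fr">
<head>
<meta charset="utf-8">
<title>Nested quotation across languages (sketch)</title>
</head>
<body>
  <!-- Outer q: its parent (the p) is French.
       Inner q: its parent is the outer q, which is English. -->
  <p>Mais Lucy répondit: <q lang="en">Give George my love and tell him, <q>Muddle</q></q>.</p>
</body>
</html>
```

With this structure, a rule that looks only at the immediate parent's language sees French for the outer quotation and English for the inner one, which is why the nested case needs separate treatment.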

fantasai commented 5 months ago

OK, I pushed some edits for this. There's actually two approaches we can take for match-parent:

Anyway, I took the second option (resolve at used-value time only) because it's more consistent with how auto works. But would appreciate WG review. https://github.com/w3c/csswg-drafts/commit/8b2623c9568a69939e7fc854c1148a263ac17b0a

@r12a Currently auto resolves on each element it's used, it's just now changed to look at the parent instead of the element itself. To get the behavior you have in the last example (Mais Lucy répondit: «Give George my love and tell him, ‹Muddle›».) you'd need to spec q q { quotes: match-parent; }. Do you think we should do this by default or is e.g. (Mais Lucy répondit: «Give George my love and tell him, “Muddle”».) a reasonable default?

r12a commented 5 months ago

Currently auto resolves on each element it's used, it's just now changed to look at the parent instead of the element itself. To get the behavior you have in the last example (Mais Lucy répondit: «Give George my love and tell him, ‹Muddle›».) you'd need to spec q q { quotes: match-parent; }. Do you think we should do this by default or is e.g. (Mais Lucy répondit: «Give George my love and tell him, “Muddle”».) a reasonable default?

No, no. Keeping the same style of quote marks for embedded quotes is what i'm asking for in many of my previous comments in this thread, and as the default. Everything i've seen indicates that the internal quotes should be consistent with the external quotes. I therefore don't see “Muddle” as a reasonable default. It should be ‹Muddle›. hth

Also, what happens if you have more than one set of embedded quotes – rare, i grant you, but probably not something to ignore.

I also find myself wondering why auto should be the default setting, since match-parent (where parent means the context outside the outermost quotes) is really what we need by default. It seems to me that the auto setting should produce that most of the time anyway – i hope it will not turn into something that allows browsers to continue doing the wrong thing as they have been doing so far.

fantasai commented 5 months ago

@r12a match-parent won't do the right thing as an initial value because it maps the element's language to a quotation mark system and then inherits with that choice locked, which by default means it'll take the language from the root element (which is the outermost element), map it to a quotation mark system, and trace that system through the whole document tree. That won't do the right thing in a lot of cases.

frivoal commented 5 months ago

you'd need to spec q q { quotes: match-parent; }

I think we'd need q * { quotes: match-parent; } in the UA stylesheet to get the behavior @r12a wants, and I cannot think (at least not yet) of another way to achieve the same result. Should we do that?

r12a commented 5 months ago

I think i had misread the meaning of auto. Tell me if i'm right in thinking that if the default situation is

q { quotes: auto; }
q * { quotes: match-parent; }

then Mais Lucy répondit: «Give George my love and tell him, ‹Muddle›». would come out as expected, even if there was a third or fourth embedded level, and even if the languages of each embedded level change. That is, the quotes for the top level are set by the language outside the top level (auto), but the quote marks for any embedded levels simply match the quotes used in the top level, regardless of language. (And of course the actual quote characters can vary according to the level, eg. single or double guillemets, but they always match the quotes one would use in the language of the text outside the top level.)

If that's so, then this may be a useful approach for dealing with the q element, but content authors will need to know/remember to use the two lines of CSS for any non-q elements if they hack it themselves. That's a bit of a concern.

frivoal commented 5 months ago

Tell me if i'm right […]

By my understanding, yes, that is correct.

frivoal commented 3 months ago

So, as discussed on a recent i18n call, the above discussion is wrong, because content: open-quote (resp. content: close-quote) is not used on the q element, but on ::before (resp. ::after), which are children of the q element.

So what we would need to get the desired behavior is:

With that in place, the pre-existing ua stylesheet rules

q::before { content: open-quote; }
q::after { content: close-quote; }

would do what i18n expects.

quotes: auto could go back to computing to a set of strings describing the appropriate type of quotes based on the content language of the current element. This would not be used by q, but could be useful in other contexts, such as blockquote elements, which authors could opt into styling either with auto or parent-language. Alternatively, if resolving based on the current element's language is considered useless in all contexts, auto can be the name we use for the parent-language behavior.
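
As an illustration of the opt-in mentioned for blockquote, here is a minimal sketch using only `quotes: auto`, which the thread notes is already shipping. The `parent-language` keyword is just a working name from this discussion, so it is deliberately not used. Which language `auto` keys on is exactly the open question here, so no particular rendering is claimed.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Opting blockquote into generated quote marks (sketch)</title>
<style>
  /* auto lets the UA choose the marks; stated explicitly for clarity. */
  blockquote { quotes: auto; }

  /* Generate the marks, as the UA stylesheet does for q. */
  blockquote::before { content: open-quote; }
  blockquote::after  { content: close-quote; }
</style>
</head>
<body>
  <p>But Lucy replied:</p>
  <blockquote lang="fr">Embrassez George de ma part.</blockquote>
</body>
</html>
```

The text is placed directly inside the blockquote so that the generated marks sit inline with it. With block-level children such as `p`, the marks would land on lines of their own.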

xfq commented 3 months ago

This issue was discussed in a meeting.

svgeesus commented 1 month ago

data:text/html,<div lang="zh"><q lang="fr">Quote <q>embedded</q></q> <q>Quote</q>

css-meeting-bot commented 1 month ago

The CSS Working Group just discussed [css-content] Quote character choice must depend on surrounding language, not language of the quotation, and agreed to the following:

RESOLVED: adopt a quotes value that, if the parent's quotes computed value is auto, computes the correct quotes for the parent's language, and if the parent's quotes computed value is not auto, inherits that value
RESOLVED: drop existing match-parent value

The full IRC log of that discussion
<emeyer> florian: The internationalization people have told us what they want
<dbaron> s/five other ways/five other ways, whether computed style from font relative units or layout/
<emeyer> …A quote should inject at the start and end the quotation marks of the parent's language, not the element's language
<emeyer> …But if you nest them, the innermost should use the quote of the outermost quotation style
<emeyer> …In the general default case, you have to lock in the quote style and carry it down the ancestor chain
<emeyer> …We think we've come up with some approaches, but they could be performantly painful
<emeyer> The old-fashioned way was to supply a bunch of pairs of strings
<emeyer> 1. The spec has an auto value which computes to itself and represents the quotation system implied by the language of the parent element
<emeyer> fantasai: So if you have English text with a French quote, you use English quotation marks, even though the q element and its generated content are in French
<emeyer> …But if the French quote has a quote in French or English or whatever, you still want to use the English style quotes because you're in an English paragraph
<florian> q { quotes: auto; }
<florian> q::before,
<florian> q::after,
<florian> q * { quotes: match-parent; }
<emeyer> florian: So to achieve that, there's a match-parent value that computes to itself and means, "use the same quotation system as my parent"
<kizu> +q
<emeyer> …This does put a `q *` selector in UA stylesheets
<emeyer> …If you think this doesn't solve as stated, say so; if you think it's fine to put it in UA stylesheets, say so
<astearns> ack kizu
<emeyer> kizu: If we had a content pseudo-element, we might use this for quotes?
<emeyer> fantasai: No, that wouldn't work, because inheritance is before box construction
<emeyer> florian: 2. Uses different keywords not specced yet, but let's call one 'parent-language'
<emeyer> …This is like 'auto', but it computes to the set of strings the language needed, and inherits as those strings
<florian> q { quotes: parent-language; }
<florian> q q { quotes: inherit; }
<emeyer> s/but/except/
<emeyer> …If you're a quote element, you lock on your parent, unless you're nested, in which case you inherit the strings
<emeyer> …I don't know which UAs would prefer, I suspect the latter
<emeyer> …If neither works, what do we do??
<emeyer> dbaron: If both are bad, we could solve it by having a value that uses the parent language only if the parent has the initial value, and otherwise inherits from the parent
<emeyer> florian: Say again?
<emeyer> dbaron: 'parent-language-or-auto', uses the logical value 'auto' would use, otherwise inherits
<kbabbitt> q+
<emeyer> florian: It computes to itself and stays as is?
<emeyer> dbaron: All I need auto for is it's the initial value of the property
<emeyer> …So on the outer element it would look at the parent to get the language
<emeyer> (crosstalk)
<emeyer> …Then if you have a nested q element, you look at the parent, it isn't auto, so you just inherit it
<emeyer> florian: I agree that would also work
<emeyer> …One thing to not forget and I don't think you did, the place where the actual quotes are injected, they're the before and after, but that should still work with what you described
<emeyer> …Great, now we have three solutions; which is most implementable?
<astearns> ack kbabbitt
<emeyer> kbabbitt: I vote against putting a universal selector in a UA stylesheet, as it will likely trigger a lot of style invalidation
<emeyer> dbaron: I would defer to Rune or Anne for their thought on option 2
<dbaron> s/or Anne/or Ian/
<emeyer> florian: internationalization is reluctant about option 2, since you have to remember to put a bunch of non-obvious things in stylesheets
<emeyer> …I think David's thing is closer to what they want
<emeyer> dbaron: That's true, the downside is it's more magical and more weird
<emeyer> …We'd need a name
<emeyer> fantasai: match-parent?
<emeyer> dbaron: noooooooooo
<emeyer> florian: It doesn't quite work because it has to do different things
<emeyer> fantasai: Another one is to call one value normal and one auto?
<emeyer> dbaron: We could name for what it isn't for instead of what it does
<TabAtkins> for-q
<emeyer> florian: Bikeshedding aside, between option 2 and 3, which is better?
<TabAtkins> [me yelling at the dude who just sideswiped me] "FOR-Q"
<emeyer> futhark: I'm a little confused about the options, but avoiding the universal is good
<fantasai> 1. q { quotes: auto; }
<fantasai> q * { quotes: match-parent; }
<fantasai> 2. q { quotes: parent-language; }
<fantasai> q q { quotes: inherit; }
<emeyer> dbaron: Option 2 had a `q q` selector
<fantasai> 3. q { quotes: magic; }
<emeyer> …option 3 has only `q` because all the magic is in the value
<emeyer> futhark: I think I prefer that
<ChrisL> q {quotes: anon }
<emeyer> florian: All the magic happens at compute time, which is a different concern
<emeyer> futhark: Does it have to walk up the tree?
<emeyer> dbaron: It involves looking at the parent and maybe pulling stuff down
<emeyer> futhark: That sounds the best option
<emeyer> florian: So we can go back to bikeshedding option 3
<emeyer> jfkthame: Can we just not change the value of auto?
<emeyer> florian: ??
<emeyer> iank_: That would change the existing quotes property?
<emeyer> florian: The value now is only strings, there is no auto
<astearns> s/??/we could use auto, but then we need an initial value - maybe normal?/
<emeyer> fantasai: But everyone's shipping auto
<emeyer> florian: What does it do?
<emeyer> fantasai: [missed]
<emeyer> florian: So auto does something and has to be kept and we need a new name
<emeyer> fantasai: capture?
<emeyer> florian: Maybe
<emeyer> astearns: I prefer going back to the issue while resolving on behavior
<dbaron> quotes: outer-language ?
<fantasai> It wouldn't be outer language if you nested three languages though
<dbaron> Proposed resolution: adopt a quotes value that, if the parent's quotes computed value is auto, computes the correct quotes for the parent's language, and if the parent's quotes computed value is not auto, inherits that value.
<bkardell_> I'm good with that
<emilio> q+
<emeyer> emilio: Doesn't this require you to walk arbitrary ancestors?
<emeyer> florian: You only have to look at the parent
<emeyer> emilio: This would only apply to the quote element?
<emeyer> fantasai: You could put it on any element
<emeyer> dbaron: This will mostly work for style computation and inheritance
<emeyer> …The special inheritance rules here will only require looking up one step for each element
<emeyer> emilio: But it would expose ?? data for the language
<emeyer> dbaron: Yes
<fantasai> s/??/CLDR/
<emeyer> emilio: Okay, I think it might be fine
<emeyer> dbaron: I guess that might be a substantive change in that you need to do CLDR seek at computation time rather than layout time
<emeyer> emilio: I don't think that's a hard lookup to do, but I don't know how much we can expose that data
<emeyer> dbaron: I think you can get it with text data
<dbaron> s/text data/text measurement/
<emeyer> fantasai: CLDR doesn't vary by locale, it's ???
<emeyer> florian: It exposes information about the page, not the environment
<dbaron> s/???/something baked into the browser/
<emeyer> emilio: As long as that's the case, it's okay
<emeyer> RESOLVED: adopt a quotes value that, if the parent's quotes computed value is auto, computes the correct quotes for the parent's language, and if the parent's quotes computed value is not auto, inherits that value
<emeyer> florian: I think we should drop match-parent, which doesn't do what we want
<emeyer> astearns: Concerns?
<fantasai> PROPOSED: Drop existing match-parent value
<emeyer> dbaron: How widely implemented is it?
<emeyer> florian: I don't think it is
<emeyer> astearns: Objections?
<emeyer> RESOLVED: drop existing match-parent value
<emeyer> florian: Now, what does auto do? I think there are two different behaviors
<emeyer> …Unless we have an obvious answer, it should be a separate issue

frivoal commented 1 month ago

See also follow up discussion in https://github.com/w3c/csswg-drafts/issues/10436

r12a commented 1 month ago

[This and the following comments were moved to https://github.com/w3c/csswg-drafts/issues/10468]

I was just reviewing an article i wrote in 2018 but haven't yet published, pending the resolution of this discussion. Some parts, esp. related to how to write CSS, need to be rewritten after the dust settles. The following section may be useful to describe the problem we are attempting to solve: https://w3c.github.io/i18n-drafts/questions/qa-the-q-element.en#how

However, the section Problems with scope describes an issue that i'm not sure we have yet discussed: If you apply a font change to a quotation in another language marked up with the q element, the font change also affects the quote marks, but the quote marks should be treated as part of the surrounding text, and remain in the same font as the stuff around the quote.

Should we discuss that here, or raise a separate issue?
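
To make the scope problem concrete, here is a small sketch. The class names are hypothetical, and `monospace` is used only because it is a visible font change that needs no installed fonts. It relies on the fact that `::before` and `::after` inherit from the element that generates them.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Font change reaching the quote marks (sketch)</title>
<style>
  /* Case 1: the font is set on q itself. q::before and q::after inherit
     from q, so the generated quote marks change font as well. */
  q.styled-q { font-family: monospace; }

  /* Case 2: the font is set on an inner wrapper. The marks are generated
     on q, outside the span, so they keep the surrounding text's font. */
  q .quoted-text { font-family: monospace; }
</style>
</head>
<body>
  <p>But Lucy replied: <q class="styled-q" lang="fr">Embrassez George de ma part</q>.</p>
  <p>But Lucy replied: <q lang="fr"><span class="quoted-text">Embrassez George de ma part</span></q>.</p>
</body>
</html>
```

The second form is a possible author-side workaround rather than a fix, since it needs an extra element around every quotation.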

jfkthame commented 1 month ago

If you apply a font change to a quotation in another language marked up with the q element, the font change also affects the quote marks, but the quote marks should be treated as part of the surrounding text, and remain in the same font as the stuff around the quote.

Probably worth discussing in a separate issue. Logically, I agree that the quote marks "belong" to the context of the quoted text, but if the quote is styled quite distinctively, I suspect it could look quite jarring to surround it with quote marks that don't match its style.

(I'm sure I've seen typographic guidelines that insist punctuation marks should use the style of the adjacent word, and haven't said anything about an exception for quote marks.)