w3c / csswg-drafts

CSS Working Group Editor Drafts
https://drafts.csswg.org/

[css-text] For most languages, hyphens:auto should not hyphenate Capitalized words #3927

Closed jfkthame closed 5 years ago

jfkthame commented 5 years ago

When auto-hyphenation is in use, I believe that in most languages - with German being the major exception - it would be preferable for browsers not to hyphenate capitalized words, which will often be proper nouns. In many cases authors and readers will prefer that names (of people, companies, etc) not be split, and in addition hyphenation rules designed for the "normal" words of a language may fail to hyphenate many names appropriately.

(https://bugzilla.mozilla.org/show_bug.cgi?id=1550532 was recently filed against Gecko about this issue.)

The CSS Text 3 spec explicitly does not specify exactly where hyphenation opportunities occur when hyphens:auto is used. However, I would suggest adding an informative note to the spec, suggesting that browsers may want to suppress auto-hyphenation of capitalized words except when the hyphenation language in use is German.

For CSS Text 4, perhaps a property should be introduced to allow authors to explicitly control this behavior; e.g. hyphenation-capitalized-words: auto | yes | no, where yes and no would have the obvious meaning, and auto would tell the browser to use whatever heuristics it may have, such as considering the current language.
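To make the proposal concrete, here is a rough sketch of how an author might use such a property. `hyphenation-capitalized-words` is only the name floated in this comment: it is not part of any specification and no browser implements it, so current engines would simply drop these declarations. The `article`, `h1` and `.credits` selectors are illustrative assumptions.

```css
/* Hypothetical: hyphenation-capitalized-words is the property name proposed
   in this comment; it is not specified or implemented anywhere. */
article {
  hyphens: auto;
  /* Let the browser apply its own, possibly language-dependent, heuristics. */
  hyphenation-capitalized-words: auto;
}

/* Narrow headings: allow breaks in capitalized words as well. */
article h1 {
  hyphenation-capitalized-words: yes;
}

/* Hypothetical class for text full of names that should stay intact. */
.credits {
  hyphenation-capitalized-words: no;
}
```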

Crissov commented 5 years ago

name, .name, ::proper-noun { hyphens: none; }

if there is sufficient markup or if there were such a semantic pseudo-element.

litherum commented 5 years ago

Is a new property really worth it? Is this something authors are asking to be able to control? Can the browser just do it right in the first place?

jfkthame commented 5 years ago

Well, what's "the right thing" for a browser to do regarding hyphenation of capitalized words? I don't think there's a clear answer to that, although I do think browsers should try for a sensible default behavior, and in https://bugzilla.mozilla.org/show_bug.cgi?id=1550532 we just made the suggested adjustment for Firefox.

The problem is that in some cases authors/users may prefer that proper names not be hyphenated (as requested in the Mozilla bug); we can't reliably identify proper names in general text, but we can use capitalized words as the best available proxy for this (except in German); but this has the drawback that we'll also suppress hyphenation of non-names at the beginning of sentences; in some cases, this trade-off may be too great and it'd be preferable to allow capitalized hyphenation after all. I don't think a single hard-coded behavior will ever satisfy all use cases.

(A further refinement to the heuristic -- not yet implemented -- would be to make the behavior dependent on line width, so that as line width is reduced, constraints on what may be hyphenated are relaxed.)

Note that systems such as TeX (the \uchyph parameter) and InDesign (the "Hyphenate Capitalized Words" option in paragraph formatting) do expose this question to authors, recognizing that there is not a simple "correct" behavior that the application can universally use.

Obviously, authors can override the browser's heuristics by adding markup to individual names; the question here is what kind of default behavior, and how much author control, we can/should offer for (the overwhelming majority of) text that does not have that level of detailed markup.
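As a minimal sketch of that per-name override, assuming the author wraps each name in an element carrying a hypothetical `name` class and that the content is language-tagged so that `hyphens: auto` can take effect:

```css
/* Allow automatic hyphenation for running text. */
p {
  hyphens: auto;
}

/* Hypothetical class applied by the author to individual names,
   e.g. <span class="name">...</span>. */
.name {
  /* manual: break only at explicit opportunities such as soft hyphens. */
  hyphens: manual;
}
```

Using `hyphens: none` instead would also ignore any soft hyphens inside the name. Either way, this only helps for text that actually carries the markup, which is the limitation discussed here.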

SelenIT commented 5 years ago

I'm not sure that not hyphenating capitalized words in English is a rule and hyphenating them in German is an exception, and not the other way around. At least, AFAIK, in Russian there is no special case for capitalized words regarding hyphenation (only abbreviations are not hyphenated). Maybe a bit more statistics is needed?

jfkthame commented 5 years ago

I don't believe there are (in general) firm rules about this in either direction; it's a judgement call, and may depend on the specific content and the context in which it's being presented, as well as the individual preferences of the author/typographer.

As such, I think the best we can do in CSS is to offer some guidance as to good default behaviors for browsers -- and further information regarding typical usage in various languages may be helpful -- together with adequate controls so that authors can achieve the results they want.

litherum commented 5 years ago

WebKit just got a bug about this too (possibly filed by the same person) https://bugs.webkit.org/show_bug.cgi?id=197889

AmeliaBR commented 5 years ago

> Note that systems such as TeX … and InDesign … do expose this question to authors.

This is a very good argument for adding a new property. Does anyone have other examples?

revoltpuppy commented 5 years ago

Hello, I’m the person filing these bugs. I appreciate the discussion. For the record, here's the bug I sent to Blink, too: https://bugs.chromium.org/p/chromium/issues/detail?id=963039&can=2&q=hyphen%20proper%20nouns

I do recognize that it will be difficult to find the perfect solution that works for everyone, but I think there can be more sensible defaults. People don’t like it when their name gets broken at the end of a line. Companies don’t like it when their own materials add hyphens into the middle of their brand names.

Hyphenation should be a progressive enhancement. Over the last 10+ years, I haven’t been able to use it in a professional setting, because I’m always asked to turn it off the instant someone sees their brand name or their own name broken across a line. That’s not an enhancement. I understand that we can turn it off on a case-by-case basis with .name or something similar, but that puts the burden on content owners to wrap every name in a span. That’s not an enhancement either.

I wonder, too, if we could add a new value to the hyphens property, all, instead of having a whole separate property. auto would be updated to hyphenate capitalized words based on language (e.g. in German, but not English) and all would hyphenate capitalized words regardless of language. Or keep auto as currently defined and add no-capitalized-words as the new value.

AmeliaBR commented 5 years ago

> I wonder, too, if we could add a new value to the hyphens property, all, instead of having a whole separate property.

Note that there are already multiple properties proposed for controlling hyphenation in CSS Text 4, and other open issues suggesting more control. So adding a single new keyword likely wouldn't be enough.
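For context, the CSS Text 4 draft properties referred to here include `hyphenate-limit-chars` and `hyphenate-limit-lines`. The sketch below shows how they might be combined with `hyphens: auto`; the numbers are arbitrary assumptions, browser support varies, and none of these properties addresses capitalized words specifically.

```css
p {
  hyphens: auto;
  /* Only hyphenate words of at least 8 characters, leaving at least
     3 characters before and 3 after the break. Values are illustrative. */
  hyphenate-limit-chars: 8 3 3;
  /* Allow at most 2 consecutive hyphenated lines. */
  hyphenate-limit-lines: 2;
}
```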

fantasai commented 5 years ago

I'm happy to add a note to CSS Text saying that UAs might want to use heuristics to suppress hyphenation in proper nouns, but I don't think we should define those heuristics in the spec.

("Capitalized words except in German" might want to be "Capitalized words except in German and except after periods", or in a CSS-to-PDF renderer used in publication workflows, even "Capitalized words except in German and except after periods unless we saw it capitalized not after a period." I don't think we'll come up with the ideal heuristics here.)

revoltpuppy commented 5 years ago

The last one, “Capitalized words, except in German, and except after periods, unless we saw it capitalized not after a period,” is the best heuristic I’ve seen so far, and the fact that it’s used in publication workflows backs that up.

Crissov commented 5 years ago

“Capitalized” probably meaning contains a capital letter, not begins with a capital letter to capture “iTunes” and the likes.

jfkthame commented 5 years ago

That's a good point, although in practice I wonder how many such names are actually long enough that hyphenation rules are likely to apply to them? Current browsers don't appear to find a hyphenation opportunity in "iTunes", for example, regardless of casing.

jfkthame commented 5 years ago

...when using English rules; however, I notice that with lang=de, we can hyphenate "iTu-nes". That's probably not ideal.

fantasai commented 5 years ago

@revoltpuppy To be clear, that was a hypothetical example. :) Not very practical for browsers, but much more practical for publication workflows.

css-meeting-bot commented 5 years ago

The CSS Working Group just discussed hyphens:auto should not hyphenate Capitalized words, and agreed to the following:

RESOLVED: Add A note to the spec and close with no normative change

The full IRC log of that discussion:
<Rossen_> Topic: hyphens:auto should not hyphenate Capitalized words
<Rossen_> github: https://github.com/w3c/csswg-drafts/issues/3927
<una> florian: so the issue being raised is that in some langs, when words are capitalized you should hyphenate and in some they should not
<una> ... we should bake this into the spec
<una> ... i'd like to close this as wontfix or rejected bc we already say this is dict based within the logic of the lang-based resource
<dauwhe> q+
<una> fantasai: I would go a little farther and say that we should only put a note and not change normative requirements and talk about proper nouns
<una> ... it can suggest i.e. in English you may want to supress hyphenation words that are proper nouns and mixed case
<una> ... I would like to leave the heuristics up to the user agent and not bake anything into the spec
<Rossen_> ack dauwhe
<una> dave: in english should capital letters be hyphenated? maybe... I wouldn't want anythign baked into the spec that says what should happen
<astearns> s/dave/dauwhe/
<una> AmeliaBR: the rec is more to add a suggested note to add in your hyphenation dictionaries you should consider this
<una> ... at least one browser has agreed
<una> ... not sure this is a normative requirement
<una> Rossen_: so proposed resolution for this is to add a note and no normative change
<una> RESOLVED: Add A note to the spec and close with no normative change
<una> florian: myles, a while back you raised 3566 - should we reopen?

asmusf commented 5 years ago

> ...when using English rules; however, I notice that with lang=de, we can hyphenate "iTu-nes". That's probably not ideal.

That hyphenation somehow implies that Germans pronounce the word eye-too-ness instead of eye-toons or eye-tjoons. It seems "not ideal" for reasons other than capitalization; just as loan words generally aren't regular.

spacecakes commented 2 years ago

> Hello, I’m the person filing these bugs. I appreciate the discussion.

Oh, it took me a while to figure out hyphenating capitalized words was regarded as a bug that's now been fixed. For Swedish (which has many long words), it was a pain to figure out why headlines would not hyphenate even though they wrapped onto new lines on small screens. I thought that was a bug.

HeikkiYlipaavalniemi commented 2 years ago

It seems like both Finnish and Swedish languages are suffering from this feature since both languages can have really long words and in a mobile layout they can easily go over the screen width. And if the sentence starts with a word which is longer than the device width, it doesn't hyphenate it at all. Either it will just wrap to the next line without hyphenation or break the layout. We also tested this with changing the first word as non-capitalized and then hyphenation worked correctly.

In my opinion this rule shouldn't apply to the first word in a sentence because it's almost always capitalized.

arknu commented 2 years ago

Why would you not want to hyphenate capitalized words? That decision makes absolutely no sense. It seems that, as usual, decisions are taken looking only at English and not taking into account that other languages may have different needs.

For English hyphenation may be a luxury, but for many languages with longer words (Danish, Norwegian, Swedish, Finnish, German and lots more) hyphenation is an absolute necessity for proper text layout, especially on mobile where lines are quite short. You just broke text layout for a large number of languages.

Take this example: [image]

A single-word headline which, as you would expect, starts with a capital letter. Not getting hyphenated because of this stupid argument. Did no-one stop to think that not hyphenating the first word in a sentence might be a bad idea?

You absolutely CANNOT use capital letters to detect proper nouns. German capitalizes every single noun and those can be quite long. You cannot make random exceptions for different languages (like German in this case) - the spec should be language-agnostic. The web is supposed to work for all languages, yet we are once again seeing the one-sided American view that "every language must work like English".

And why would you not want to hyphenate proper nouns in the first place? They are words like any other and they need to be hyphenated when they would protrude out of their box.

This decision needs to be reverted ASAP. It makes CSS hyphenation pretty much useless, forcing us to use bloated JS libraries for what should be something that just works in the browser. Word processors have been doing automatic hyphenation for decades, it can't be that hard.

Rather than just randomly deciding not to hyphenate capitalized words, you should have added a property to control it. That way you wouldn't be harming everyone, as is currently the case. You can always use CSS to turn hyphenation off for specific words, if needed. But I cannot override this behavior right now.

spacecakes commented 2 years ago

☝️ this, but worded differently. Hyphenation (especially machine-determined) is not pretty, but it's nicer than overflow or arbitrarily split-up words.

Does this "fix" even make sense for English? Surely you'd prefer hyphenation over

Incompre
hensibiliti
es

on small screens with large fonts?

asmusf commented 2 years ago

In quite a few news paragraphs in German, the proper nouns can be the longest words, like that of politician Sabine Leutheusser-Schnarrenberger (which causes additional issues for typesetting because it's already hyphenated).

That said, I've not been able to quickly find websites or online documents that use automatic hyphenation at all. Anybody know some good examples (in various languages)?

revoltpuppy commented 2 years ago

Unfortunately the updated suggestions for hyphenation have not yet been adopted by all browsers. Safari closed the change request without comment. This means auto-hyphenation is still fraught because some browsers will hyphenate correctly, while others incorrectly hyphenate proper names (in English).

Because the behavior is undesirable in some browsers, most sites are still making do with other workarounds.

arknu commented 2 years ago

@revoltpuppy The one case where Safari is correct, then. This change is complete nonsense. Proper nouns should be hyphenated like any other word. They can break text layout just as easily. If there are specific words you don't want hyphenated, use a CSS class and a span to disable hyphenation for that word.

r12a commented 2 years ago

> That said, I've not been able to quickly find websites or online documents that use automatic hyphenation at all. Anybody know some good examples (in various languages)?

fwiw, W3C i18n articles use it (see https://www.w3.org/International/articlelist). The following article has a fair number of translations (i'm aware that some of the links need extra clicks - will fix this): https://www.w3.org/International/questions/qa-doc-charset It will depend on which browser you use as to which languages show hyphenation, and some adjustment of the window width is needed: on Firefox i saw hyphenation for Deutsch English Español Français हिन्दी Italiano Polski Português Português-BR Pусский Svenska Українська

asmusf commented 2 years ago

Thanks. In FF, the German version's hyphenation seems fine, but there seems to be an effort to prevent hyphenation of already hyphenated terms. Given that the elements in a hyphenated compound are not always short, the result is some uneven lines.

Look at the hyphenation of "Dokument-Zeichensatz" to see what I mean. It happens to work out in the title, because breaking a title into two balanced parts is better than filling one line and having just a bit left over. But that should have been the result of an esthetic rule about type balance for headers. (In the body of the article the same rule leads to some bad line widths.)

It's a well-meaning rule (to avoid two different kinds of hyphen in the same compound) but you'd never tolerate the effect in a book. So why in a browser?

jfkthame commented 2 years ago

> Look at the hyphenation of "Dokument-Zeichensatz" to see what I mean. It happens to work out in the title,

The title uses hyphens: none in its CSS. If you disable that rule, and make it sufficiently narrow, you can get a result like

Do-
ku-
ment-
Zei-
chen-
satz

but I'm not sure that browsers (or authors) should be too concerned about optimizing for such extreme cases.

> It's a well-meaning rule (to avoid two different kinds of hyphen in the same compound) but you'd never tolerate the effect in a book. So why in a browser?

When you're typesetting a book, you have the luxury of making individual decisions for the specific layout (font, text size, line width, etc) that you're producing. So you can decide whether it's preferable to split

... Dokument-
Zeichensatz ...

at the explicit hyphen, preserving each component intact but perhaps leaving the line that ends "Dokument-" a bit short, or to hyphenate one of the components, e.g. resulting in

... Dokument-Zei-
chensatz ...

in order to more precisely fill the lines. And you don't have to worry that the reader will suddenly zoom the text (or resize the page) such that only a dozen characters fit on each line.

For a browser that has to dynamically lay out the text, I don't think it's easy to say, in general, which is better; it'll depend on the relative weight given to various subjective factors, and the appropriate balance is likely to be different for very narrow columns than for more "normal" page sizes.

hftf commented 2 years ago

(For controlling line breaking at explicit hyphens, see also the open issue #3434.)

asmusf commented 2 years ago

Even books aren't necessarily static documents any more, if you consider something other than a novel. Non-fiction books may be updated after their first release, and at that point any custom, manual optimizations may bite you.

At the same time, books remain rather unforgiving with respect to badly justified paragraphs. And even ragged-right may look too ragged if you have any rule that rigidly prevents splitting long words.

The type of algorithm that will produce superior results is one that uses weights, and balances poor choice of hyphenation location with other factors such as uneven line length (and where applicable: unacceptably loose or tight text as a result of justifying a line).

Such an algorithm should be able to cope well with emergency situations.

In the example the title is a single line. If the window gets too narrow, a weight-based algorithm should be able to detect that

Dokument-
Zeichensatz

works better than

Dokument-Zeichen-
satz

while for a really narrow column

Doku-
ment-
Zeichen-
satz

may be more natural than having two lines with overflow.

The way to influence such an algorithm would be by raising/lowering the priority (weight) for various line-breaking and hyphenation opportunities, but not by crudely turning some of them off or on.

The key would be to define the controls in relative mode, so that they are not dependent on any absolute weights for a given implementation.

revoltpuppy commented 2 years ago

> This change is complete nonsense. Proper nouns should be hyphenated like any other word. They can break text layout just as easily. If there are specific words you don't want hyphenated, use a CSS class and a span to disable hyphenation for that word.

I mean, I could just say if there are specific words you want hyphenated, you should use a class and a span, too. It’s just impractical to put a span around every proper noun, no matter which way you look at it. Names as small as six letters could become hyphenated, and because of that almost nobody (writing in English) uses hyphens: auto, which is the problem in the first place. There are problems with breaking the layout and there are problems with incorrect hyphenation, and we’re trying to find the best balance for the most people.

In English, it is wrong to hyphenate names. In German, it is preferred. The spec already suggests that when German is detected that, yes, proper nouns should be hyphenated. If there are other languages where this is an issue, create a change request so that it can be discussed and those languages can be fixed. If your browser isn’t following the spec’s suggestion, put in a bug fix request with that browser.

arknu commented 2 years ago

@revoltpuppy I strongly disagree that the CSS spec should in any way make recommendations for a specific language. It seems that whoever made this change knew English and a little German and left it at that, not bothering to research the issue more broadly. If that had been done, it would have been discovered immediately that this would not work in most languages.

It's one thing to not hyphenate words with a capital letter in a sentence. But not hyphenating the first word in a sentence is the really critical bit here. Whoever thought that was a good idea?

If English is the only language where you don't want to hyphenate proper names, then why was the change not restricted to English? Why the specific exception for German when many other languages are also affected? The process here is clearly broken. This needs to be acknowledged by the working group and the change reverted so that a proper solution that works for all languages can be worked out.

You have created massive compatibility issues worldwide for years because you didn't want a name hyphenated in English. We finally had working hyphenation in most browsers. Not perfect, but adequate. And then you decided to break it because of some random complaint. Hyphenating capitalized words is NOT a bug, it is a necessary feature!

The arrogance of just assuming that "it works like that in English, then it must be the same in other languages" is really getting annoying. So many stupid design decisions in tech have been made because Americans have no concept of how other languages work. This is just the latest.

If all languages had been taken into account from the beginning, hyphenation would have been implemented 10-15 years ago because hyphenation is so important for laying out text properly in a lot of languages other than English.

arknu commented 2 years ago

I have used hyphens:auto on many real-world sites, precisely because it solves a real-world problem in Danish that used to require a heavy JS dependency. This site is an example: https://gaarden.nu/ (it will load hyphenator.js if loaded in a browser that doesn't support hyphens:auto).

And this completely random and unannounced spec change has probably created quite a few layout problems on these sites as they rely on hyphenation to have headlines break properly. In Danish, a headline will very frequently start with a long word.

revoltpuppy commented 2 years ago

Believe me, I get that incorrect hyphenation is frustrating. I’ve been there. I’m still there. hyphens: auto deserves a lot of attention, and it could be improved in a number of ways beyond detected language and proper nouns. There are already some useful suggestions posted above.

The good news is, there’s a way to get things fixed, it’s actually pretty simple, and changes can be made pretty quickly! And hurling insults in a closed issue is not that way.

HeikkiYlipaavalniemi commented 2 years ago

@revoltpuppy What are your recommendations to move this closed issue forward?

I think we all agree that hyphenation is hard and the problem is very different in different languages (e.g. English, German, Finnish). Because of this the changes should be tested and considered from multiple angles. In recent years the hyphens: auto feature has been the most effective tool in Finnish to keep layouts intact. We actually have real words like lentokonesuihkuturbiinimoottoriapumekaanikkoaliupseerioppilas (for real) in Finnish which quite easily break the layout. And the change that a sentence's first word is not hyphenated breaks things at mobile widths and in grid layouts.

We have used for example https://github.com/mnater/Hyphenopoly as a second tool to make it work for all browsers but the native CSS support is in my opinion the best way to solve this problem.

asmusf commented 2 years ago

I repeat my suggestion that the correct way to make hyphenation decisions uses ranking/weighting/prioritizing to meet the conflicting goals of avoiding either a loose line (on the one hand) or awkward hyphenation on the other.

Any method that attempts to do this with on/off switches for selecting features will get things badly wrong - and the edge cases will show up.

So you get nonsense like not hyphenating the only word on the line, even if it exceeds total line length just because it fits some on/off criterion.

Or nonsense like a half-empty line.

The right way, clearly, is to recognize that at some point, avoiding an extremely unbalanced line (or avoiding overflow) takes precedence over avoiding things like hyphenating a name (or avoiding hyphenating a hyphenated compound).

Those restrictions are fine as long as they don't cause edge cases. So the way to deal with that is to have a way for any automated setting to override them in layout emergencies.

If we had this understanding then we would most probably not have this discussion, because even a non-optimal set of priorities (paying attention to the rules for one language over others) would not have managed to cause such extreme edge cases.

arknu commented 2 years ago

@HeikkiYlipaavalniemi For me, the way forward is pretty clear: The Working Group needs to acknowledge that this was not thought through properly and that it was a mistake to change the spec in the way it was. We all make mistakes occasionally, and as long as we own up to our mistakes, that is OK. From that, it follows that the change should be reverted as soon as possible, encouraging browser vendors to make the change quickly. That will leave us with a hyphenation system shipping in browsers that works for most languages in most cases.

Then, careful thought should be given to how the specific issues (like not wanting certain words hyphenated) can be addressed in a language-agnostic way. This should involve a group of people from around the world with experience with various different languages so that a solution can be designed taking into account the needs of all languages. This might involve adding additional properties to control various aspects of hyphenation, as seen in page layout software like Adobe InDesign.

frivoal commented 2 years ago

@arknu

While it is perfectly ok to disagree with something and make your opinion known, I would like to encourage you to tone down the virulence of your messages. The kind of language you have been using is not appropriate.

Furthermore, it seems that you're reacting to the title of this issue, some early comments, or maybe to what certain browsers have been doing on their own, rather than to what has actually been added in the spec, as it doesn't state what you seem to be railing against.

> whoever made this change knew English and a little German and left it at that, not bothering to research the issue more broadly. If that had been done, it would have been discovered immediately that this would not work in most languages.

The spec does not give specific rules for English and German. It states that this varies per language, and gives English and German as examples of languages with different expectations.

The spec contains an exhortation to implementers to be mindful of differences between languages, and explicitly does not define the rules to apply in each language.

> But not hyphenating the first word in a sentence is the really critical bit here. Whoever thought that was a good idea?

Nobody thought it was a good idea, and the spec does not say anything about not hyphenating the first word of a sentence, nor does it tell browsers not to hyphenate all capitalized words (which would indeed include the first word of a sentence).

As a reminder, this is what was added to the spec.

> Authors should correctly tag their content’s language (e.g. using the HTML lang attribute or XML xml:lang attribute) in order to obtain correct automatic hyphenation.
>
> The UA may use language-tailored heuristics to exclude certain words from automatic hyphenation. For example, a UA might try to avoid hyphenation in proper nouns by excluding words matching certain capitalization and punctuation patterns. Such heuristics are not defined by this specification. (Note that such heuristics will need to vary by language: English and German, for example, have very different capitalization conventions.)

This text does not state that all words with capital letters must be prevented from hyphenating. Nor does it say that that must happen in all languages except German. Please calm down. If you find what the spec does say problematic, please be specific about which part is the source of your issue, and what you think should be stated instead.
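Since the spec text ties automatic hyphenation to correct language tagging, a stylesheet can also scope hyphenation per language with the `:lang()` pseudo-class. This is only a sketch: it assumes the document sets `lang` on its root or on translated sections, the choice of languages and the `.name` class are illustrative, and which words actually get hyphenated still depends on each browser's dictionaries and heuristics.

```css
/* Enable automatic hyphenation for content tagged as German, Finnish,
   Swedish or Danish (matches e.g. lang="de" as well as lang="de-AT"). */
:lang(de),
:lang(fi),
:lang(sv),
:lang(da) {
  hyphens: auto;
}

/* For English, opt in as well but keep marked-up names unbroken.
   .name is a hypothetical author-supplied class. */
:lang(en) {
  hyphens: auto;
}

:lang(en) .name {
  hyphens: manual;
}
```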

arknu commented 2 years ago

@frivoal Well, then it seems that every browser has misinterpreted the specification massively, since all browsers that support hyphenation have now changed the implementation to only hyphenate capitalized words in German and not in any other language. And no browser hyphenates the first word in a sentence.

I was sent here after filing a Chromium bug, so naturally I assumed that the spec was the cause of the issue. While it is clear that the spec only gives English and German as examples, this has clearly been misinterpreted by implementers. Which just goes to show my point that sufficient knowledge of different languages is needed, both when making specs and when implementing them.

But I agree that from that text in the spec, browsers have no reason to have the implementation they currently have. I will continue the conversation in the various browser issue trackers.

HeikkiYlipaavalniemi commented 2 years ago

My guess is that the misunderstanding and possible problem with the actual implementation is because of the following comment made by the css-meeting-bot from a discussion in IRC:

> The CSS Working Group just discussed hyphens:auto should not hyphenate Capitalized words, and agreed to the following:
>
> RESOLVED: Add A note to the spec and close with no normative change

This seems to be very different resolution compared to the text in the specification that @frivoal mentioned.

Seems like browsers have followed more the comment resolution than the actual specification.

I agree that the proper way would be to continue the discussion in browser issue queues, but I think the specification could also have a note about, for example, hyphenating the first words of sentences, because they are always capitalized and in most cases should be hyphenated by default.

revoltpuppy commented 2 years ago

> @revoltpuppy What are your recommendations to move this closed issue forward?

Open new issues with the spec and with browser vendors about the bugs you are finding instead.

Atnas-dev commented 2 years ago

Due to language rules, some browsers have decided that automatic hyphenation is not allowed on capitalised words as these are seen as "Proper Nouns".

This is indeed true and respects correct grammar within the designated language. However, it creates the bug within the browsers themselves that the first word of a sentence (if capitalised) will NEVER be hyphenated. The same applies if an editor decides to write a headline in capitalised style, corresponding to the correct language rules of capitalisation: the legitimately capitalised words within this heading will never be hyphenated.

Hyphenating such words is essential for supporting creative, appealing, content-rich and language-appropriate hyphenation. Stating that English will NEVER hyphenate a capitalised word is simply wrong. Stating that English will never hyphenate a proper noun is very true. But detecting a proper noun by the style of the text seems to be a very vague and incorrect indication.

In other words, if this issue cannot be handled correctly, maybe it should not have been handled at all in the first place?

As mentioned in many other discussions about this issue, an additional hyphenation rule could be added to define whether or not capitalised words should be hyphenated in the given element.

hanshillen commented 1 year ago

Labels in UI controls (links, buttons, tabs, etc.) are generally capitalized, regardless of language, or whether or not they're proper nouns. Similarly, headings often use title case.

The hyphens property would have been ideal to prevent long words in headings and controls from overflowing or getting cut off when zooming to large magnification factors, ensuring a template complies with WCAG SC 1.4.10 (Reflow).

The property is not useful if it only works some of the time, and the same word may or may not get hyphenated just because it starts with a capital letter. It's not feasible to expect content providers to write everything in lower case. This means we'll have to fall back on using word-break: break-all on lower viewport widths, which is much less ideal than hyphens could have been.

If there was an extra value that ignores case (e.g., hyphens: all) that a developer could opt into, it would solve everything.
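The fallback described above might look roughly like the following. The selectors and the `30em` breakpoint are assumptions for illustration; `word-break: break-all` is the property named in this comment, and it allows breaks between any two characters without inserting a hyphen.

```css
/* Headings and control labels: ask for automatic hyphenation first. */
h1, h2, button, a {
  hyphens: auto;
}

/* Assumed breakpoint: on narrow viewports, where a capitalized word may be
   left unhyphenated, permit breaks anywhere so the word cannot overflow. */
@media (max-width: 30em) {
  h1, h2, button, a {
    word-break: break-all;
  }
}
```

The trade-off is the one noted here: the layout stays intact, but words are split at arbitrary points and without a visible hyphen.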

rdhelms commented 1 day ago

I just went down this rabbit hole of 1) being confused why a long non-linguistic string broken at the end of a line only gets a hyphen when lowercased 2) realizing that a workaround is to use lang="de"...?

I agree with @hanshillen that hyphens: all is exactly what I would have wanted and expected. Is something like that being discussed officially anywhere? My desired behavior with that would simply be for hyphens to be inserted anywhere that a string is broken at the end of a line.