[css-pseudo-4] `::nth-letter` pseudo-element

adactio commented 6 years ago

I'd like to propose that there be an ::nth-letter pseudo-element in CSS.

I've written about it on my website.

Previous calls:

The last time this came up, @frivoal said:

This sits at the uncomfortable intersection of useful and freaking hard to implement.

But also:

As for ::nth-letter, I am not aware of any substitute, but neither am I aware of strong use cases.

I would argue that the strong use cases are being demonstrated every day by any site using these JavaScript libraries:

Or these logo examples in the wild:

...along with most of these variable font demos.

Whether using HTML or JavaScript libraries, the only way to target specific letters is to wrap them in unnecessary elements. This can cause accessibility issues: a logo being read out letter by letter, for example, rather than as a whole word.

I understand that this would be hard to implement (given that ::first-letter was hard enough) but I think the use-cases and accessibility benefits could outweigh the potential difficulty.

tomhodgins commented 6 years ago

I've published a demo earlier this week for :nth-letter() and :nth-word() pseudos, including a JavaScript implementation that can be tested/used from CSS: https://codepen.io/tomhodgins/pen/YJZyPr

tabatkins commented 6 years ago

All of these examples (and I expect, nearly all reasonable examples you could dredge up beyond these) are fancy headings; any usage of ::nth-letter in body text would almost certainly be... dodgy at best. ::first-letter has its classic use-case of drop-caps (which we're blessing with specialized properties!), but even with that, there's a very good chance we wouldn't consider it worthwhile if someone tried to introduce it today!

One-off effects on top-level headings in a document still seem to be well-handled by just putting some spans around the letters you want to individually style. Letting people just do that, rather than trying to deal with all the complexities and weirdness of ::nth-letter, is probably still the way to go.

(For example, I don't think any of those examples work if a "real" element splits the character, like fo[COMBINING ACUTE ACCENT]o. Looks like three letters, fóo, and any reasonable definition of ::nth-letter(2) would have it targeting the entire ó letter, but the pseudo-element has to be split into two separate boxes due to the existing tree structure.)
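To make the split-character problem concrete, here is a minimal sketch that counts user-perceived letters with `Intl.Segmenter` (assuming a runtime that ships it). The two-node split is a hypothetical stand-in for an element boundary falling between the "o" and its combining accent; it is not taken from any of the linked examples.

```javascript
// Count grapheme clusters (user-perceived characters) in a string.
function countGraphemes(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  return Array.from(segmenter.segment(text)).length;
}

// "fóo" written as f, o, U+0301 COMBINING ACUTE ACCENT, o.
const whole = "fo\u0301o";

// Hypothetical markup: an element boundary sits between "o" and the accent,
// so the text ends up in two separate text nodes.
const nodes = ["fo", "\u0301o"];

const lettersInWhole = countGraphemes(whole);
const lettersPerNode = nodes.map(countGraphemes);
const lettersSummed = lettersPerNode.reduce((sum, n) => sum + n, 0);

console.log(lettersInWhole); // letters a reader perceives in the whole text
console.log(lettersSummed);  // letters found when each node is segmented alone
```

The two counts disagree because the accent is cut off from its base letter, which is the situation where a single `::nth-letter(2)` would need more than one box.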

upsuper commented 6 years ago

Yeah... ::first-letter really taught us that it isn't something worth doing :) It should have been done in the initial-letter way...

fantasai commented 6 years ago

@upsuper initial-letters is a layout property that happens to apply to :first-letter, it doesn't magically select the first letter the way text-combine-upright: digits magically selects numbers ...

@tabatkins I don't think any reasonably marked up document would put an element boundary there, so it's OK for that to be handled as a simplified error of some kind. I think the basic point is that pseudo-elements of this type make implementers very unhappy* so a moderately convincing use case with an easy workaround is likely not enough motivation.

A fun example is even-odd styling of :nth-letter() over a paragraph. Now insert a caret at character 5 and start typing. Somehow that has to work as expected, and reasonably performantly as well because that's what the user expects. Other fun considerations include “exactly how do Indic scripts work here?”, “what about Dutch ij?” (which is probably not the same way as :first-letter), and other i18n inquiries.
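A rough sketch of why the even-odd case is expensive: inserting one character flips the parity of every letter after the caret, so all of them would need restyling. The function names are made up for illustration, and the sketch assumes plain ASCII text so that string indices and letters line up.

```javascript
// Assign "odd"/"even" to each letter by its 1-based position (ASCII only).
function parityClasses(text) {
  return Array.from(text, (_, i) => ((i + 1) % 2 === 0 ? "even" : "odd"));
}

// Count how many existing letters change class when `inserted` is typed
// at `index`. Only letters at or after the caret can be affected.
function restyledAfterInsert(text, index, inserted) {
  const before = parityClasses(text);
  const updated = text.slice(0, index) + inserted + text.slice(index);
  const after = parityClasses(updated);
  let changed = 0;
  for (let i = index; i < text.length; i++) {
    // The letter that was at i now lives at i + inserted.length.
    if (before[i] !== after[i + inserted.length]) changed++;
  }
  return changed;
}

const paragraph = "abcdefghij";
console.log(restyledAfterInsert(paragraph, 4, "x"));  // one keystroke
console.log(restyledAfterInsert(paragraph, 4, "xy")); // two keystrokes
```

With a single typed character, every letter after the caret changes class; after a second character they all change back.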

adactio commented 6 years ago

@tabatkins said:

One-off effects on top-level headings in a document still seem to be well-handled by just putting some spans around the letters you want to individually style.

Like I said, I knew it would be difficult to implement. Still, I'm surprised that difficulty of implementation would win out over the accessibility problems of wrapping individual letters in a bunch of spans.

All of these examples (and I expect, nearly all reasonable examples you could dredge up beyond these)

Dredge up.

Interesting.

tabatkins commented 6 years ago

Still, I'm surprised that difficulty of implementation would win out over the accessibility problems of wrapping individual letters in a bunch of spans.

The hierarchy of concerns tells us that certain players in the space matter more than others; user-a11y is generally more important than implementation difficulty. It is not, however, absolute - ::nth-letter() has significant implementation difficulties, over and above even that of ::first-letter which is already troublesome, and has less-common use-cases. All of these combine to at least make ::nth-letter() a rather low priority project, versus many other things we could add to CSS.

Solving the a11y issues can be done more simply than adding ::nth-letter(), such as having something in HTML or CSS that says "this element is not a semantic separation of its contained text from the surrounding text", which could be applied to the spans. This would also solve slightly wider problems than ::nth-letter(), as anything that splits up a word for any reason could also use it, rather than the solution being limited solely to the use-case of applying different CSS to letter-based groupings.

Dredge up.

Interesting.

? I'm confused, it looks like you might be reading some deep meaning into my wording?

shshaw commented 4 years ago

Author of Splitting.js here. The difficulty of implementation is an issue I'm sure, but ::nth-letter would allow for so many great possibilities especially if combined with a letter-index() counter and letter-count() total, akin to this proposal to get the index of children.

Bonus points if ::nth-word could be implemented as well.

You can see some of the really creative ways folks have utilized Splitting for text, and it's not isolated to decorative headings. More use cases can be found in GSAP with its SplitText plugin.

In the interim, I would love some proper accessibility support for splitting up text with <span>s, as mentioned. I've tried several approaches in Splitting but can't find the right setup for keeping the text properly available to screenreaders. Perhaps aria-role="inline" that would treat an element as if it were not distinct from its siblings.
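For reference, a minimal sketch of the kind of span-wrapping being discussed. It is not Splitting.js's actual output: the function name, the `--letter-index` and `--letter-count` custom properties, and the `aria-label` plus `aria-hidden` arrangement are all assumptions for illustration, and, as noted above, screen reader handling of such markup is not reliable.

```javascript
// Escape text for safe use in HTML content and attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap each grapheme cluster in a span that carries its index
// (hypothetical property names standing in for letter-index()/letter-count()).
function wrapLetters(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  const letters = Array.from(segmenter.segment(text), (s) => s.segment);
  const spans = letters
    .map(
      (letter, i) =>
        `<span aria-hidden="true" style="--letter-index:${i}">${escapeHtml(letter)}</span>`
    )
    .join("");
  // The container keeps the unsplit text as its label (an assumption; support varies).
  return `<span aria-label="${escapeHtml(text)}" style="--letter-count:${letters.length}">${spans}</span>`;
}

console.log(wrapLetters("Hi"));
```

A stylesheet could then read the index with `calc(var(--letter-index) * 50ms)` or similar, which is roughly what a native letter-index() counter would make unnecessary.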

AmeliaBR commented 4 years ago

Worth mentioning that we've had lots of discussion in SVG about how to handle features that assign properties to individual characters (https://github.com/w3c/svgwg/issues/631). However, the conclusion we made there was to stick to predictable parsing (codepoints) over typographical elegance, an approach that isn't at all consistent with CSS ::first-letter.

faceless2 commented 4 years ago

See also https://github.com/w3c/svgwg/issues/537, and in particular the linked-to examples showing how complicated this is for some scripts, eg https://w3c.github.io/i18n-tests/quick-tests/svg-counting/svg-counting-001.

Crissov commented 4 years ago

Perhaps there should be a more extensive snapshot profile, i.e. as a separate specification, that also included properties or values, but as a selector, ::nth-letter() – and ::last-letter and ::nth-last-letter() – would fit in very well with :has(). That would only address performance concerns, not solve any internationalization issues at all, of course.

SelenIT commented 4 years ago

Snapshot profile is dead (luckily), it's just that the draft hasn't been updated accordingly yet (I'm going to make a relevant PR this weekend). IMO, there is already no problem with accessing any character of the text (as well as with selecting DOM elements based on their contents) from JS APIs. It's styling these things that CSS developers have demanded for years but can't do yet.

shshaw commented 4 years ago

Accessing characters from JS is problematic when you get into Emoji or extended Unicode ranges where some “characters” are actually multiple, like the “family” emojis 👨‍👩‍👧‍👦

The current solution is a messy RegEx pattern matching specific ranges of characters (see some discussion here https://github.com/shshaw/Splitting/issues/25). Having a built-in RegEx matched to target ligatures/combined characters would be a huge help.
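As an illustration of the mismatch, the sketch below measures the family emoji three ways. It swaps the RegEx approach for `Intl.Segmenter` from the ECMAScript Internationalization API, which is an assumption about the runtime rather than something Splitting.js is confirmed to use.

```javascript
// Split a string into grapheme clusters (user-perceived characters).
function splitGraphemes(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

// Man, woman, girl, boy joined by U+200D ZERO WIDTH JOINER.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";

const codeUnits = family.length;                // what string indexing sees
const codePoints = Array.from(family).length;   // what string iteration sees
const graphemes = splitGraphemes(family).length; // what a reader sees

console.log({ codeUnits, codePoints, graphemes });
```

Splitting on code units or code points would tear the emoji apart, which is why per-letter wrapping needs cluster-aware segmentation.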

SelenIT commented 4 years ago

@shshaw well, ::first-letter also has its problems with emojis (at least in Gecko), hope it's just a bug...

ByteEater-pl commented 4 years ago

Thinking… Maybe that's a use case better suited for shadow DOM? If scripting is confined therein, authors reaching for reusable components needn't be concerned much. The text could be programmatically split into text nodes in the light DOM corresponding to grapheme clusters, with spans inserted around each in the shadow. What remains is indexing – ::part and ::theme don't take arguments; maybe they should, optionally, to be resolved by script or some CSS-variables-in-selectors machinery? Or is it Houdini's domain, whereby an actual ::nth-letter() selector working in tandem with shadow DOM could be added (in which case I'd rather make it more general, e.g. ::nth-specially-marked-item-in-container())?

severdia commented 3 years ago

Here's a use case for ::nth-letter() (or at least second letter). When trying to reproduce historical texts like Shakespeare, initial-letter doesn't work for several reasons:

There are hard line breaks in poetry and the dropcap should span two lines (or more in many cases) and it's not recognized (<br> doesn't work for semantic reasons)
Sometimes the dropcap is two letters (see below).
The next letter after the dropcap is often capitalized, but not always.


If there's reason enough to support dropcaps, there should be reason enough to support them correctly, not superficially.

faceless2 commented 3 years ago

You could do this with initial-letter by enclosing the two "V" characters in a span (or by making it a "W" and just using an appropriate font...), so I'm not sure this is a case for nth-letter.

However, here's an idea. How about we introduce control over which letters are considered part of the first-letter by using Unicode joiners? We could specify that the first-letter pseudo-element must not break at a ZWJ and must break at a ZWNJ.

That would allow us to support this example with V&zwj;VHen to the Sessions..., and also cover quite a few of the edge cases that seem to be coming up with punctuation and first-letter, without any additional CSS properties. I suspect this would be pretty easy to implement too.
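A minimal sketch of how that rule might behave, purely as an illustration of the proposal and not of any specified or implemented algorithm. The function name is hypothetical, the sketch works on code points rather than full typographic letter units, and it ignores the punctuation handling that real ::first-letter performs.

```javascript
const ZWJ = "\u200D";  // ZERO WIDTH JOINER: proposed "do not break here"
const ZWNJ = "\u200C"; // ZERO WIDTH NON-JOINER: proposed "must break here"

// Return the letters that the proposed rule would put in ::first-letter.
function firstLetterRun(text) {
  const points = Array.from(text);
  if (points.length === 0) return "";
  let run = points[0];
  let i = 1;
  // Each ZWJ glues the following code point onto the run.
  while (points[i] === ZWJ && i + 1 < points.length) {
    run += points[i + 1];
    i += 2;
  }
  // A ZWNJ (or any other character) ends the run. In this sketch the default
  // never groups letters, so ZWNJ only matters where real rules would group.
  return run;
}

console.log(firstLetterRun("V" + ZWJ + "VHen to the Sessions")); // joined pair
console.log(firstLetterRun("V" + ZWNJ + "VHen to the Sessions")); // forced break
```

The first call selects both "V" characters as one initial, while the second stops after the first "V".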

severdia commented 3 years ago

It's possible to ask font designers to include these kinds of examples as single characters, but not likely to happen.

I tried initial-letter with a <span> and it doesn't work with two letters, nor does it work on inline elements (only block elements, which breaks the dropcap formatting). Here's a Codepen: https://codepen.io/severdia/pen/BapMMzx

I think it's an interesting idea to control it with a Unicode word joiner. That should work for all the use cases I can think of. Thanks!

johannesmutter commented 2 years ago

As for ::nth-letter, I am not aware of any substitute, but neither am I aware of strong use cases.

Common use cases for nth-letter / nth-word / etc. are code editors, or text editors more generally, and annotations for grammar, syntax, spelling, entities (NLP), etc. These often require overlapping markup (see standoff properties / annotations for more).

In the above use cases, styling the characters is insufficient: you will also want to register event listeners, arbitrary data, … on them. So I’m not convinced an extension of the CSS spec would be a good idea.

Except for using canvas/ CSS Houdini, I currently don’t see a good alternative to using a million span tags that wrap each character. To keep the markup readable I'm storing the plain text string in a data attribute of e.g. a paragraph tag. (however markup readability shouldn’t be a concern if you can inspect a readable JSON of the content)

<p aria-description="Hello World">
  <span annotations={...}>H</span>
  <span annotations={...}>e</span>
  <span annotations={...}>l</span>
  <span annotations={...}>l</span>
  <span annotations={...}>o</span>
...
</p>
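
As a rough illustration of the standoff idea, the sketch below keeps the plain text separate from a list of possibly overlapping ranges and derives non-overlapping segments from them, so each segment (rather than each character) could become one span. The function name, the annotation shape (`start`, `end`, `label`), and the sample labels are assumptions for illustration; offsets are UTF-16 code units.

```javascript
// Split `text` into runs over which the set of active annotations is constant.
function segmentByAnnotations(text, annotations) {
  // Every annotation boundary is a place where the active set may change.
  const cuts = new Set([0, text.length]);
  for (const { start, end } of annotations) {
    cuts.add(start);
    cuts.add(end);
  }
  const offsets = [...cuts]
    .filter((o) => o >= 0 && o <= text.length)
    .sort((a, b) => a - b);

  const segments = [];
  for (let i = 0; i < offsets.length - 1; i++) {
    const from = offsets[i];
    const to = offsets[i + 1];
    // An annotation applies if it covers the whole run.
    const labels = annotations
      .filter((a) => a.start <= from && a.end >= to)
      .map((a) => a.label);
    segments.push({ text: text.slice(from, to), labels });
  }
  return segments;
}

const segments = segmentByAnnotations("Hello World", [
  { start: 0, end: 5, label: "greeting" },
  { start: 3, end: 8, label: "spelling" },
]);
console.log(segments);
```

Overlapping ranges never need to nest in the markup this way, at the cost of recomputing segments whenever the text or the annotations change.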

standoff markup and annotations

w3c / csswg-drafts

[css-pseudo-4] `::nth-letter` pseudo-element #3208