w3c / csswg-drafts

CSS Working Group Editor Drafts
https://drafts.csswg.org/

[css-text] Show invisible characters #8874

Open Crissov opened 1 year ago

Crissov commented 1 year ago

Especially for rendering source code and editing text, non-printing whitespace and control characters should optionally be shown “with ink”.

Although these sometimes use a deemphasized, grayish color, I think this would best be done within text-transform or perhaps even in a new sub-property. Since @text-transform #3132 has not gotten anywhere yet, I’m proposing new bikesheddable values:

PS: It is quite possible this was discussed before, but I could not find a respective issue, so it might have been only on www-style long ago.
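
To make the idea concrete, here is a minimal TypeScript sketch of the kind of substitution such a value might perform. The mapping is an assumption for illustration only: it uses the Unicode Control Pictures block (U+2400 to U+2421) plus U+2423 OPEN BOX for the space, and it is not the set of values proposed in this issue. The name `showControlPictures` is hypothetical.

```typescript
// Hypothetical helper: replaces C0 control characters, SPACE and DELETE
// with visible counterparts from the Unicode "Control Pictures" block.
// The choice of replacement glyphs is an assumption, not part of any spec.
function showControlPictures(text: string): string {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x20) {
      // U+0000..U+001F map to U+2400..U+241F (e.g. TAB becomes U+2409).
      out += String.fromCharCode(0x2400 + code);
    } else if (code === 0x20) {
      out += "\u2423"; // OPEN BOX as a visible space
    } else if (code === 0x7f) {
      out += "\u2421"; // SYMBOL FOR DELETE
    } else {
      out += text.charAt(i); // everything else passes through unchanged
    }
  }
  return out;
}
```

Note that a plain substitution like this removes the original characters, so a replaced line feed no longer breaks the line.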

johannesodland commented 1 year ago

Great proposal! That would be helpful.

Including U+00A0 (No-Break Space) and U+00AD (Soft Hyphen) would also be beneficial, but I’m not sure how they should be represented.

tabatkins commented 1 year ago

A wrinkle here is that showing the characters shouldn't affect the actual text formatting, especially the linebreak ones. They need to be displayed as purely visual overlays, rather than like normal characters.
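
One way to picture the overlay approach is to leave the text untouched and compute a separate list of markers to paint on top of it. The TypeScript sketch below assumes a hypothetical `Marker` shape and an arbitrary set of characters to flag (C0 controls, SPACE, NO-BREAK SPACE, SOFT HYPHEN); none of these names come from a spec.

```typescript
// Hypothetical shape for an overlay marker: where it sits and what to draw.
interface Marker {
  index: number; // UTF-16 offset into the unchanged source text
  label: string; // code point label, e.g. "U+000A"
}

// Returns markers for invisible characters without modifying the text,
// so line breaking and spacing of the original are unaffected.
function findInvisibles(text: string): Marker[] {
  const markers: Marker[] = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    // Assumed set: C0 controls, SPACE, NO-BREAK SPACE, SOFT HYPHEN.
    if (code <= 0x20 || code === 0xa0 || code === 0xad) {
      const hex = code.toString(16).toUpperCase();
      markers.push({ index: i, label: "U+" + ("0000" + hex).slice(-4) });
    }
  }
  return markers;
}
```

A renderer could then draw each label at the position of `index` while laying out the original string as usual.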

Crissov commented 1 year ago

I think it is sometimes acceptable or even expected that visibly showing space characters would affect spacing, probably also depending on the settings for whitespace collapsing, i.e. does substitution happen before or after collapsing.
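
The ordering question can be illustrated with a small TypeScript sketch. Both functions are simplified assumptions: `collapse` is only a rough stand-in for CSS white-space collapsing, and `showSpaces` substitutes U+00B7 MIDDLE DOT as an arbitrary visible glyph.

```typescript
// Simplified stand-in for white-space collapsing: runs of spaces, tabs and
// line feeds become a single space. Real CSS collapsing has more rules.
function collapse(text: string): string {
  return text.replace(/[ \t\n]+/g, " ");
}

// Assumed substitution: every SPACE becomes U+00B7 MIDDLE DOT.
function showSpaces(text: string): string {
  return text.replace(/ /g, "\u00B7");
}
```

For an input of `a`, three spaces, `b`: `showSpaces(collapse(input))` yields a single dot between the letters, whereas `collapse(showSpaces(input))` yields three, because the dots are no longer white space by the time collapsing runs.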

frivoal commented 1 year ago

I don't think it'd be appropriate to text-transform a line break into a character that visually indicates that there should be a line break, but doesn't actually provide one. All in all, the feature seems useful, but it's not clear to me that text-transform is the right property for this.

faceless2 commented 1 year ago

This is the "I want to do complex replacements on text" problem we've seen a few times before. https://github.com/w3c/csswg-drafts/issues/4875

johannesodland commented 1 year ago

To further elaborate on the “text editing” use case:

When editing text for publication it is necessary to be able to insert and inspect invisible characters. Without this capability, such characters may be inadvertently omitted, compromising the quality of the final text.

While tools like Word provide solutions to view invisible characters, the web lacks a standardized method. Users and publications rely on every product implementing a custom solution for text replacements, but support is mixed. This missing support impacts language and typography considerably.

For instance, in Norwegian, a space serves as a thousands separator. To prevent line breaks within numbers, we utilize a no-break space, e.g., “1 000 000”. However, the challenges in inputting and inspecting the nbsp have led many web publications to deviate from standard Norwegian typography, often using a full stop instead.
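
As a small illustration of the thousands-separator case, the following TypeScript sketch groups digits with U+00A0 NO-BREAK SPACE. The helper name `groupThousands` is hypothetical, and the function only handles non-negative integers small enough to print without an exponent.

```typescript
const NBSP = "\u00A0"; // NO-BREAK SPACE

// Hypothetical helper: groups the digits of a non-negative integer in threes,
// separated by NO-BREAK SPACE so the number cannot wrap across lines.
function groupThousands(n: number): string {
  const digits = String(Math.floor(n));
  // Insert a separator at every position followed by a multiple of 3 digits.
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, NBSP);
}
```

The result looks identical to a number grouped with ordinary spaces, which is exactly why an author cannot verify it without some way to show the character.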

Furthermore, languages like Norwegian frequently use long compound words. Here, soft hyphens are vital to allow soft breaks and prevent unsightly text wrapping. Yet, the inability to visualise these soft hyphens means many publications lack proper text breaks. Sometimes a single word can be left on the preceding line as a whole compound word is pushed to the next.
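
A rough TypeScript sketch of the soft-hyphen case follows. The function names and the vertical bar used as the visible stand-in are assumptions chosen for illustration, as is the example word.

```typescript
const SHY = "\u00AD"; // SOFT HYPHEN

// Joins the parts of a compound word with SOFT HYPHEN, giving the
// line breaker an optional break point at each boundary.
function joinWithSoftHyphens(parts: string[]): string {
  return parts.join(SHY);
}

// Makes the break opportunities visible for inspection.
// The vertical bar is an arbitrary choice of marker.
function revealSoftHyphens(text: string): string {
  return text.split(SHY).join("|");
}
```

For example, `joinWithSoftHyphens(["menneske", "rettighets", "organisasjon"])` produces a string that normally renders as one unbroken word, and `revealSoftHyphens` applied to it returns `menneske|rettighets|organisasjon`.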

For editing purposes it is not a problem that the visualised characters impact the formatting. Furthermore, this use case can be solved by other means than full "complex replacement on text".

Without improved support for viewing invisible characters, the web could inadvertently alter local typographical standards, much like the typewriter did in its era.

frivoal commented 1 year ago

No question it's useful in an editor. The more pressing question, in my mind, is whether that should be provided by the web platform itself, or whether it should be handled by the web-based editor. We have a whole history of tension between trying to provide high level features baked into the platform for editing, and providing the right primitives to enable (js-based) editors to build such features effectively.

johannesodland commented 1 year ago

I agree that many features are better left to higher level components like editors.

However, the reality is that it is not feasible to expect every framework, editor component and CMS to implement this functionality.

The content we publish is sourced from a multitude of systems. We have requested this feature from the CMSes we’re using. The solutions they provide are either lackluster or nonexistent. In addition, there’s a large number of custom form fields and editors that would all have to be customized to provide this functionality.

It feels rather hopeless.

The end result is that our language and typography are slowly changing due to poor technological support.

Having a standardized platform provided way of displaying otherwise invisible characters would go a long way towards amending this.

SebastianZ commented 1 year ago

> We have a whole history of tension between trying to provide high level features baked into the platform for editing, and providing the right primitives to enable (js-based) editors to build such features effectively.

And as @tabatkins pointed out earlier, this is purely a visualizing feature. Though as @johannesodland wrote, it should also be capable of displaying special characters like soft hyphens, zero-width spaces (U+200B, U+FEFF), etc. So it might affect text formatting in some cases. To me, showing invisible characters seems a natural fit for CSS. And it is something where CSS should pave the cowpath.

Because of the requirement of being a purely visual effect, I also think text-transform may not be the right fit. Maybe a new text-visibility or character-visibility property or something along those lines.

A few questions that come to my mind:

Sebastian

Crissov commented 8 months ago

I think there’s also a math use case with these:

Within (editor) text set with a monospaced font, it may be desirable to disambiguate the various spaces in U+200x and a handful others by showing visible glyphs for them. Since these are not readily available as Unicode characters, text-transform may indeed be the wrong place to implement this. The same goes for many (keycap) symbols from ISO 9995, ISO 7000 or IEC 60417 which are available for characters like ZWJ, ZWNJ, CGJ, NBSP.

* U+200B ZERO WIDTH SPACE ![image](https://github.com/w3c/csswg-drafts/assets/6200185/7a1d1ca7-b481-4f21-93b6-e33710345076)
* U+00AD SOFT HYPHEN ![image](https://github.com/w3c/csswg-drafts/assets/6200185/b1183e2d-a1cc-41f8-873a-72cf6f281010)
* U+200C ZERO WIDTH NONJOINER ![image](https://github.com/w3c/csswg-drafts/assets/6200185/03645cd8-e13c-4dc0-9172-c0e80b9ddbcc)
* U+2060 WORD JOINER ![image](https://github.com/w3c/csswg-drafts/assets/6200185/8af0a97b-a477-4610-b956-47bcd97e88f2)
* U+034F COMBINING GRAPHEME JOINER ![image](https://github.com/w3c/csswg-drafts/assets/6200185/ae2815cc-b7f5-498e-8f9e-cf6f97424c6d)
* U+0020 SPACE ![image](https://github.com/w3c/csswg-drafts/assets/6200185/5f1cad5b-d1d0-4ca5-b8e4-06cd4a4e4806)
* U+00 NON-BREAKING SPACE ![image](https://github.com/w3c/csswg-drafts/assets/6200185/e10e6ba3-7102-4bc1-b16f-a91c796ac247)

oliviercailloux commented 1 month ago

Maybe a silly idea, coming from a Unicode FAQ. Perhaps the rendering system could offer an opt-in rendering mode in which invisible characters are displayed simply by using a font that contains visible glyphs for them? So instead of “replacing” U+0020 SPACE with U+2423 ␣, the rendering would use a font whose glyph for U+0020 visibly represents a space, similar to ␣.