WICG / canvas-formatted-text


Consider stronger separation of concerns in the data model #27

Open yjbanov opened 3 years ago

yjbanov commented 3 years ago

The data model proposes to use CSS as the source of text properties for layout and styling. Some of the CSS properties only take effect later in the formatting and rendering pipeline. Specifying them too early has a number of ergonomic and performance disadvantages. The runtime semantics of CSS will also likely increase the complexity of implementation.

Ergonomic disadvantages

Performance disadvantages

Complexity of implementation

Proposal

tl;dr Start with a pure JavaScript API emphasizing performance over convenience, with no dependency on CSS, DOM, or anything else that implies being on the main UI thread. Add CSS and/or DOM integration as a layer on top of the core JS API. This second layer is available on the main UI thread, where document and the CSS engine are present. It adds developer conveniences, such as the ability to apply styles via CSS selectors, and compatibility with existing HTML-based web frameworks.

Core layer: pure JavaScript API

This layer is sufficient for the following use-cases:

A common theme in these use-cases is that none of them use CSS and HTML as the rendering technology for the UI. For cases that render into WebGL, the code may actually run in a web worker and be composited via an OffscreenCanvas.

Formatting text has two distinct parts: layout and rendering. The layout part is useful on its own, without rendering. For example, the output of layout can be used to hit-test the text, and to compute the mapping of a scrollbar position to the scroll offset in the content. Rendering does depend on layout, but multiple rendering backends are possible (DOM, canvas, WebGL, server-side), and some of them may want to support alternative layout systems, as well as temporally disconnect layout from rendering (e.g. across threads/workers, across events, shared cache). Let's separate the layout concern from the rendering concern.
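To illustrate how layout output is useful without any rendering, here is a minimal sketch of hit-testing and scrollbar mapping. The fragment shape (text, x, y, width, height) and the function names hitTest and scrollOffset are assumptions made for this example; none of them come from the proposal.

```javascript
// Hypothetical layout output: one rectangle per laid-out text fragment.
// The field names are illustrative and not part of any proposed API.
const fragments = [
  { text: "hello ", x: 0, y: 0, width: 36, height: 14 },
  { text: "world", x: 36, y: 0, width: 30, height: 14 },
  { text: "again", x: 0, y: 14, width: 30, height: 14 },
];

// Return the fragment containing the point (px, py), or null if none does.
function hitTest(frags, px, py) {
  for (const f of frags) {
    const insideX = px >= f.x && px < f.x + f.width;
    const insideY = py >= f.y && py < f.y + f.height;
    if (insideX && insideY) return f;
  }
  return null;
}

// Map a scrollbar position in [0, 1] to a scroll offset in content units,
// using only the laid-out content height.
function scrollOffset(frags, viewportHeight, position) {
  const contentHeight = Math.max(...frags.map((f) => f.y + f.height));
  const maxOffset = Math.max(0, contentHeight - viewportHeight);
  return position * maxOffset;
}
```

Neither function touches a canvas, the DOM, or WebGL, so both could run in a worker or on a server.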

The layout API does not have any opinion about rendering properties, whether HTML, Canvas2D, or WebGL. Its API only specifies properties relevant to text layout. The layout API should support annotating text runs with arbitrary data. This data is passed through the layout process and is assigned to the laid-out text fragments. A particular UI toolkit annotates text runs with data as it sees fit.

Example:

let text = new FormattedText();
let run = new FormattedTextRun("hello");

// Strictly typed properties relevant to layout
let style = new FormattedTextStyle();
style.fontSize = 12;
style.fontWeight = "bold";
style.letterSpacing = 4;
run.style = style;

// Attach object that's interesting to the rendering system to
// be used to paint this text, but has no effect on text layout.
// The app and/or UI toolkit decides what goes here. It's optional.
run.annotation = {
  color: new MyColor({ red: 0, green: 255, blue: 0, opacity: 0.5}),
};

text.textruns.push(run);

This is more ergonomic because of the simplicity of the API. It is clear about what's relevant to layout and what's not. There's no confusion about the subset of CSS features supported by this API (e.g. CSS matchers, "cascading properties", "property inheritance", "computed style").

This is more performant because it does not require a CSS engine or data conversion. Animation can be cheaply expressed by attaching the animation object instead of a specific color value. The text does not need to be rebuilt or relaid out. FormattedTextStyle contains the layout properties (and only layout properties) of a text. For extra efficiency, the same FormattedTextStyle object can be reused across multiple text runs, and across multiple instances of FormattedText.

The implementation is simpler (and smaller) as there is no need for a CSS engine, particularly in the web worker use-case. An additional benefit of not requiring a CSS engine is that this makes this API viable outside the browser, e.g. Node.js.
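The following sketch shows the annotation pass-through and style reuse described above. The classes are stand-ins written for this example, since the proposed API does not exist yet. The layout function is a toy that assumes every glyph is half the font size wide. The colorAt animation object and the paintColor helper are likewise hypothetical.

```javascript
// Stand-ins for the proposed classes; the constructors are assumptions.
class FormattedTextStyle {
  constructor(props = {}) {
    this.fontSize = props.fontSize ?? 12;
    this.letterSpacing = props.letterSpacing ?? 0;
  }
}

class FormattedTextRun {
  constructor(text, style, annotation = null) {
    this.text = text;
    this.style = style;
    this.annotation = annotation;
  }
}

// Toy layout: places runs left to right on a single line. Real layout would
// shape text and break lines. The annotation is copied through untouched.
function layout(runs) {
  let x = 0;
  return runs.map((run) => {
    const advance = 0.5 * run.style.fontSize + run.style.letterSpacing;
    const width = run.text.length * advance;
    const fragment = { text: run.text, x, width, annotation: run.annotation };
    x += width;
    return fragment;
  });
}

// One style object shared by both runs.
const shared = new FormattedTextStyle({ fontSize: 12, letterSpacing: 4 });

// The annotation is opaque to layout; here it is a hypothetical animation.
const fade = { colorAt: (t) => `rgba(0, 255, 0, ${1 - t})` };

const runs = [
  new FormattedTextRun("hello", shared, fade),
  new FormattedTextRun("world", shared, { color: "black" }),
];

const laidOut = layout(runs);

// The renderer evaluates the animation per frame without rerunning layout.
function paintColor(fragment, t) {
  const a = fragment.annotation;
  return a.colorAt ? a.colorAt(t) : a.color;
}
```

Calling paintColor(laidOut[0], t) with a new t on each frame changes the color while laidOut stays the same, which is the point of keeping paint data out of the layout input.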

CSS layer: integration with HTML DOM on UI thread

This layer can be used by apps that use HTML and CSS for UI rendering. In this case, the input into the system is the CSS/HTML DOM created by the app (typically with the assistance of an HTML-based framework such as React, Vue, Angular, etc). In this mode it is safe to assume the presence of an HTML/CSS engine, as well as the document top-level variable.

When running on the main UI thread, and when building the UI on top of HTML and CSS, an additional API layer is available. This layer allows applying styles that cascade through the document. The API would need to support supplying styles using the CSS syntax (as described in the current proposal).

In addition, the API should allow reading back a FormattedTextParagraph from an HTML element containing text (triggering the necessary style recalc and page layout in the process, similar to how getBoundingClientRect does it).

yjbanov commented 3 years ago

Edit: added FormattedTextStyle for extra efficiency

yjbanov commented 3 years ago

Edit 2: split the API into 2 layers

travisleithead commented 2 years ago

In #39, I borrowed heavily from the FormattedTextStyle concept here, except that I did not use it as an opportunity to re-describe text properties. It should serve the benefit of re-usability as you describe, which is good.

> This is more performant because it does not require a CSS engine or data conversion.

So much of this feature must already depend on the existing layout engine in order to implement the layout/formatting required (unless we wanted to write a new inline layout engine from scratch). So, data conversion would be needed regardless. Since CSS is already well-understood by developers, we're continuing to pursue CSS as the style input language (despite many of the properties being irrelevant to inline layout scenarios). The FormattedTextStyle makes it possible to only pay the CSS parser engine cost once and re-use multiple times, which I love.

Providing animation values is an interesting use case that is not fully fleshed out yet. I had some daydreams in issue https://github.com/WICG/canvas-formatted-text/issues/44.

> The layout API should support annotating text runs with arbitrary data. This data is passed through the layout process and is assigned to the laid-out text fragments. A particular UI toolkit annotates text runs with data as it sees fit.

I'd like to understand the scenario for this pass-through requirement more. run.annotation can be added after the fact fairly easily, assuming we provide the right input-to-output mapping association, which is a requirement for many text-editing scenarios.
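As a rough sketch of that alternative, assume layout output records which input run and character range each fragment came from. The runIndex, start, and end fields and the resolve helper are hypothetical names for this example; the actual mapping mechanism is not specified in this thread.

```javascript
// Input side: runs as the caller created them, annotations kept separately.
const inputRuns = ["hello world", "!"];
const annotations = new Map([
  [0, { color: "green" }],
  [1, { color: "red" }],
]);

// Hypothetical layout output: each fragment names its source run and the
// character range it covers. Run 0 was split in two by a line break.
const outputFragments = [
  { runIndex: 0, start: 0, end: 6, line: 0 },
  { runIndex: 0, start: 6, end: 11, line: 1 },
  { runIndex: 1, start: 0, end: 1, line: 1 },
];

// Recover the source text and the caller's annotation for a fragment.
function resolve(fragment) {
  const source = inputRuns[fragment.runIndex];
  const text = source.slice(fragment.start, fragment.end);
  return { text, annotation: annotations.get(fragment.runIndex) ?? null };
}
```

With this approach the layout engine never sees the annotation. The caller pays for a lookup per fragment instead.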