emberjs / rfcs

RFCs for changes to Ember
https://rfcs.emberjs.com/
791 stars 406 forks source link

Whitespace handling for `<template>` tag #982

Open chancancode opened 1 year ago

chancancode commented 1 year ago

https://github.com/emberjs/rfcs/pull/779 does not specify the whitespace handling semantics for <template> tag, as a result the current implementation leaves all the whitespace intact as they appear in the source code. This causes several refactoring hazards:

  1. when migrating from .hbs the indentation level likely will change, since <template> is typically nested inside a class, and even for template-only component, the content is customarily indented after the opening <template> tag
  2. when refactoring code, say, nesting things inside an if statement, it changes the indentation level thus the amount of whitespace inside the template tag

It appears that there are broad consensus that this is a problem, and at a minimum, we need to strip the leading indentation.

However, I would like to go a step further, and consider taking the opportunity and collapsing whitespaces by default.

The reason we care about preserving whitespace is for HTML compatibility. The guiding principle for Ember template is that "things that look like HTML should behave like HTML", however, I am not sure if in this particular case it is worth it. Generally, the browser collapses whitespace so it rarely matters, but in the cases where it does, I think the extra leading indentation from the <template> tag will make things too finicky/confusing to rely on whether we strip leading indention or not.

Since the <template> tag makes it easy to access the JavaScript scope, a more reliable and more readable thing to do, IMO, would be to do this in JavaScript:

import { action } from "@ember/object";
import Component from "@glimmer/component";
import { stripIndent } from "common-tags"

const STATIC_EXAMPLE = stripIndent`
  function greeting() {
    console.log(
      "hello, world!"
    );
  }
`;

class ExampleCodeComponent {
  @action dynamicExample(defaultName = "human") {
    return stripIndent`
      function personalizedGreeting(name = ${JSON.stringify(defaultName)}) {
        console.log(
          "hello, \${name}!"
        );
      }
    `;
  }

  <template>
    <p>
      Static example:
      <pre><code>{{STATIC_EXAMPLE}}</code></pre>
    </p>
    <p>
      Dynamic example:
      <pre><code>{{this.dynamicExample @user.name}}</code></pre>
    </p>
  </template>
}

Since you can always extract any whitespace sensitive text into JavaScript, I think that is sufficient for the white-space: pre kind of use cases.

So I think we can specify that whitespace <template> has white-space: normal semantics by default – or more precisely template("...", { ... }) apply the equivalent of white-space: normal collapsing to its content by default.

In addition to the regular HTML/browser white space collapsing semantics, we have to specify:

  1. what happens to the leading and trailing whitespace
    <template>
      [👈 these 👉]
    </templaet>

    becomes template("\n\x20\x20[👈 these 👉]\n\x20\x20")

  2. what happens around handlebars constructs
    <template>
      [these 👉]
        {{#if this.isAdmin}}
          [👈 and these 👉]
        {{/if}}
      [👈 and these]
    </templaet>

    becomes template("\n\x20\x20[these 👉]\n\x20\x20\x20\x20{{#if this.isAdmin}}\n\x20\x20\x20\x20\x20\x20[👈 and these 👉]\n\x20\x20\x20\x20{{/if}}\n\x20\x20[👈 and these]\n")

  3. the browser algorithm is supposed to be language dependent In languages that separate words with spaces, such as English:
    <p>
      Here is an English paragraph
      that is broken into multiple lines
      in the source code so that it can
      be more easily read and edited
      in a text editor.
    </p>

    ...is intended to read as (with applicable soft wrap) "Here is an English paragraph that is broken into multiple lines in the source code so that it can be more easily read and edited in a text editor." Where the chunks joined by spaces. On the other hand, in languages that do not separate words, such as Chinese:

    <p>
      這個段落是那麼長,
      在一行寫不行。最好
      用三行寫。
    </p>

    ...is intended to read as "這個段落是那麼長,在一行寫不行。最好用三行寫。" ...which is what the CSS spec recommends. However, "historically" browsers unconditionally treated \n the same as spaces, which results in this well-known broken behavior: "這個段落是那麼長,\x20在一行寫不行。最好\x20用三行寫。" This behavior is ultimately left in the spec as browser defined. In my testing today, at least on Safari and Chrome they still have the "historical" broken behavior.

So, my proposal is:

  1. We .trim() the template text for maximal composability, i.e. the template content begins with and end at the first/last none-whitespace character. Invoking a component should not introduce extra whitespace by default, the caller can easily add whitespace (and typically do) on the outside as needed.
  2. We don't do anything special for handlebars but continue to respect handlebars whitespace control: {{~, {{~# etc.
  3. We normalize and collapse whitespaces within a line:
    1. \t is replaced by \x20,
    2. leading and trailing \x20 on the line are trimmed,
    3. multiple successive whitespace characters are replace with a single \x20.
  4. We normalize line breaks:
    1. \r\n becomes \n
    2. lone \r becomes \n
    3. multiple successive \n are replaced with a single \n
  5. We leave the \n alone for the browser to decide what to do

We should make it possible to overridable locally:

  1. template("...", { whitespace: "collapse" | "preserve" = "collapse")
  2. Something like:

    <template whitespace="collapse">...</template>
    <template whitespace="preserve">...</template>

    ..or if we are not ready to use the "attributes" space...

    // whitespace: collapse
    <template>...</template>
    
    // whitespace: preserve
    <template>...</template>

We could also consider having a way to change the global default, but it gets complicated:

  1. where do you put this config?
  2. is it global global, or is it scoped to the app/addon?
  3. does it get propagated to the runtime compiler if present?
NullVoxPopuli commented 1 year ago
cp issues/982 pulls/
simonihmig commented 1 year ago

This makes a lot of sense!

Sorry if I'm kinda hijacking this for another use case, but it is very much related: when doing DOM-less rendering, it would be super useful to also have something like <template whitespace="none">.

A simple imaginary example of 3D rendering:

<template whitespace="none">
  <ThreeJsCanvas>
    <Mesh/>
    <PerspectiveCamera/>
    <PointLight/>
  </ThreeJsCanvas>
</template>

<ThreeJsCanvas> would render a <canvas> element and do other setup work, everything else is not rendering any DOM but causing other rendering side effects like creating objects and adding them to the scene graph (which is currently difficult due to https://github.com/emberjs/rfcs/issues/597, which could be solved by https://github.com/emberjs/rfcs/pull/975, however I'm digressing...). In this scenario, all the white space would cause empty text nodes to be created as children of <canvas>, which is completely useless.

This gets even worse when doing more dynamic and complex things with e.g. {{#if}} and {{#each}}, where Glimmer would mindlessly create and remove whitespace text nodes dynamically, all within the hot path, where you just have 16ms for all the work to get to 60fps.

Note that this is equally true for classic hbs files as well as <template> currently, but the proposal here opens up a good opportunity to get that use case nicely supported I think!

chancancode commented 1 year ago

I was also thinking if we should at least support something like this for the Chinese (etc) use case:

<p>
  這個段落是那麼長,\
  在一行寫不行。最好\
  用三行寫。
</p>

But it further eats into the "is it still HTML" budget... plus I am not sure what would be the rationalization for this to have no spaces at all. Perhaps whitespace="none" works good enough here also, but I'm not sure.

chancancode commented 1 year ago

Also pointing out that if we were to do this, it would have to be a feature in the wire-format compiler, not a feature of content-tag, since you can only do it when you have parsed (knowing what's html text content, etc). Just stripping leading indentation, on the other hand, can be done in content-tag.