getodk / web-forms

ODK Web Forms enables form filling and submission editing of ODK forms in a web browser. It's coming soon! ✨
https://getodk.org
Apache License 2.0

Design: Markdown in `label`/`hint` (and `TextRange` engine/client API generally) #198

Open eyelidlessness opened 3 months ago

eyelidlessness commented 3 months ago

This issue is intended to support #62, and will cover both:

  1. The engine/client interface approach
  2. The specific approach to implementation in the engine

These don't necessarily need to be coupled, but it makes quite a bit of sense to couple them for a first pass.

First, I'll list some assumptions about requirements for the feature.

Then in the spirit of including multiple design options to choose from, I'll discuss these options:

  1. Port Collect's implementation directly
  2. Use an established parser, produce structured data suitable for a client h function

Assumptions/requirements

Option 0: Port from Collect

I'm labeling this "option 0" because it's about as close as we're going to get to a "null option". This option has some implications:

  1. We'll inherit all of the quirks of the Collect implementation. This is good for consistency, but might have some drawbacks in terms of aligning with more conventional Markdown implementations and their expected behavior.
  2. The handling of <output> is separate from that implementation, and will require some special consideration.
  3. Clients must use and trust arbitrary HTML from the engine. This has more specific implications for:

    • safety: any flaw (e.g. XSS) potentially affects all clients supporting formatted text
    • flexibility: clients must do extra work to re-parse and re-process the formatted HTML to do anything other than render it exactly as produced by the engine
    • performance: any client which might benefit from fine-grained updates will lose that capability for Markdown-formatted text from the engine

From a client perspective, this option would be consumed as:

interface TextRange {
   /* ... */
-  get formatted(): unknown;
+  get formatted(): string; // Arbitrary transformed blob of HTML
}

Option 1: Established parser, structured format, h

Some clarification of `h`

We've discussed this in some chats/meetings, but I think detailing it here is a good opportunity to make the thinking behind this option clear for posterity, and as a potential reference point for hypothetical future clients on other platforms.

The so-called `h` (or "hyperscript") function is a semi-formalism of the concept that programmatic generation of structured markup tends to follow a common pattern: `h(elementName, properties, ...children)` (though the signature can vary by implementation).

This concept is used in some form or another, to varying degrees, by nearly all of the currently popular web frameworks, including those where authoring is done in vanilla JS, as well as many compile-to-JS syntax extensions like JSX and many other compile-to-JS languages. It's even used by, or compatible with, many non-web UI solutions for other platforms. It is the underlying concept behind nearly all JSX implementations (including Vue's, React's, Preact's, and Solid's when used without its custom `dom-expressions` transform). It is also the underlying runtime concept used internally by the more idiomatic Vue SFC template language.
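
To make the pattern concrete, here is a minimal sketch of an `h` function. The `VNode` shape and the names used here are illustrative assumptions, not the API of Vue, React, or any other framework; real implementations differ in signature and in what they return.

```ts
// Hypothetical minimal node shape, for illustration only.
interface VNode {
  elementName: string;
  properties: Record<string, unknown>;
  children: VChild[];
}

type VChild = VNode | string;

// A minimal `h`: (elementName, properties, ...children) -> node.
const h = (
  elementName: string,
  properties: Record<string, unknown> = {},
  ...children: VChild[]
): VNode => ({ elementName, properties, children });

// Roughly the structure behind `<p>Hello, <strong>world</strong>!</p>`:
const paragraph = h('p', {}, 'Hello, ', h('strong', {}, 'world'), '!');
```

Nesting calls to `h` mirrors the nesting of the markup, which is why a tree-shaped data format maps onto it so directly.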

This option would entail processing Markdown with an established parser of our choosing.

Which parser?

Based on my research and a fairly thorough prototype of this proposal, I think mdast-util-from-markdown is an excellent candidate. This parses Markdown into an AST, with the same parser used by:

[... snip ...] This list could go on and on.

It's also worth considering some other parsers. Insofar as we're not migrating our XPath parsing off tree-sitter, a tree-sitter Markdown grammar is a valid option (likely at the cost of page weight). Some other JS-based Markdown parsers at least plausibly claim to be faster, but in my experience they will have greater integration challenges.

Whichever parser we choose, we'd have a Markdown processing pipeline that looks roughly like:

  1. parse(markdownText) -> AST, where the parser-produced AST is likely broader than the Markdown subset we'll support
  2. walk(AST) -> StructuredFormat, where we map aspects of the parser-produced AST to our own Markdown-subset representation; in some cases, we'd map unsupported Markdown functionality back to its corresponding raw source text (thus achieving our Markdown subset)
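
A rough sketch of the `walk` step, under stated assumptions: `AstNode` is a simplified stand-in loosely modeled on an mdast-style tree (the real parser types are broader), the `SUPPORTED` mapping is a hypothetical subset, and the example AST is written by hand in place of actual parser output.

```ts
// Simplified stand-in for a parser-produced AST node (assumption, not the real types).
interface AstNode {
  type: string;
  value?: string;
  children?: AstNode[];
  position?: { start: { offset: number }; end: { offset: number } };
}

interface MarkdownElement {
  elementName: string;
  properties: Record<string, unknown>;
  children: MarkdownChild[];
}

type MarkdownChild = MarkdownElement | string;

// Hypothetical supported subset: AST node type -> element name.
const SUPPORTED = new Map<string, string>([
  ['paragraph', 'p'],
  ['emphasis', 'em'],
  ['strong', 'strong'],
]);

const walk = (node: AstNode, source: string): MarkdownChild[] => {
  if (node.type === 'text') {
    return [node.value ?? ''];
  }

  const children = (node.children ?? []).flatMap((child) => walk(child, source));

  if (node.type === 'root') {
    return children;
  }

  const elementName = SUPPORTED.get(node.type);

  if (elementName !== undefined) {
    return [{ elementName, properties: {}, children }];
  }

  // Unsupported syntax: fall back to the raw source text it was parsed from.
  if (node.position !== undefined) {
    return [source.slice(node.position.start.offset, node.position.end.offset)];
  }

  return children;
};

// Hand-written AST standing in for the output of `parse(markdownText)`.
const source = 'Some *emphasis* and `code`';
const ast: AstNode = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: 'Some ' },
        { type: 'emphasis', children: [{ type: 'text', value: 'emphasis' }] },
        { type: 'text', value: ' and ' },
        {
          type: 'inlineCode',
          value: 'code',
          position: { start: { offset: 20 }, end: { offset: 26 } },
        },
      ],
    },
  ],
};

const structured = walk(ast, source);
```

Here the inline code span is outside the assumed subset, so it comes back as the literal source text (backticks included) rather than as an element.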

Structured format

The format structure I'd propose would roughly resemble a very simple, minimal "VNode" (as in "virtual DOM node") tree of elements. We can choose an interface specifically suitable for a particular client framework (i.e. Vue). Or we can choose a more general structure of our own design, which would impose a small amount of mapping duty on all clients. I don't feel very strongly about either; they both have their benefits and drawbacks.

This is not intended to be prescriptive about the structure, but it captures the essential concept:

interface MarkdownElement {
  elementName: string;
  properties: Record<string, unknown>;
  children: MarkdownChild[];
}

type MarkdownChild = MarkdownElement | string;

However, this is more general than necessary. We know we will support a very specific subset of Markdown, so we can be more detailed about what that subset will look like for clients:

Detailed element interface examples

```ts
interface MarkdownHeadingElement {
  elementName: 'h1' | 'h2' | 'h3' /* | ...? */;
  properties: EmptyObject; // Assume such a type exists 🙃; or: `{ lang: string }`
  children: [string]; // Consistent with Collect
}

interface MarkdownParagraphElement {
  elementName: 'p';
  properties: EmptyObject; // Or: `{ lang: string }`
  children: MarkdownInlineChild[];
}

type MarkdownBlockElement =
  | MarkdownHeadingElement
  | MarkdownParagraphElement;

interface MarkdownOutputElement {
  // Note: clients can choose to produce an `<output>` in HTML, or just unwrap its string value.
  elementName: 'output';

  // Note: while XForms and HTML `<output>` are semantically similar, XForms' `value` attribute
  // doesn't map very well to HTML's `for` attribute.
  properties: EmptyObject;
  children: [string];
}

interface MarkdownStyledElement {
  elementName: 'span';
  properties: {
    style: {
      color?: string;
      'font-face'?: string;
    };
  };
  children: MarkdownInlineChild[];
}

interface MarkdownEmphasisElement {
  elementName: 'em' | 'strong';
  properties: EmptyObject;
  children: MarkdownInlineChild[];
}

interface MarkdownLinkElement {
  elementName: 'a';
  properties: {
    href: string;
    // Maybe also: `target: '_blank';`
  };
  children: MarkdownInlineChild[];
}

type MarkdownInlineChild =
  | MarkdownOutputElement
  | MarkdownStyledElement
  | MarkdownEmphasisElement
  | MarkdownLinkElement
  | string;
```

This would be consumed by clients as:

interface TextRange {
   /* ... */
-  get formatted(): unknown;
+  get formatted(): MarkdownElement[]; // Or MarkdownBlockElement[] from the more detailed examples
}
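
For illustration, a client could map the structured value onto its framework's `h` with a small recursive function. This is a sketch under assumptions: the `H` type is a hypothetical, narrowed view of a framework `h` (an actual client would adapt it to the real signature, e.g. Vue's), and `demoH` is a stand-in used only to keep the example self-contained.

```ts
interface MarkdownElement {
  elementName: string;
  properties: Record<string, unknown>;
  children: MarkdownChild[];
}

type MarkdownChild = MarkdownElement | string;

// Hypothetical shape of a client framework's `h`, narrowed to what this sketch needs.
type H<T> = (
  elementName: string,
  properties: Record<string, unknown>,
  children: Array<T | string>
) => T;

// Recursively map the engine's structured format to whatever `h` produces.
const render = <T>(h: H<T>, nodes: readonly MarkdownChild[]): Array<T | string> =>
  nodes.map((node) =>
    typeof node === 'string'
      ? node
      : h(node.elementName, node.properties, render(h, node.children))
  );

// Stand-in for a framework `h` (assumption, for demonstration only).
interface DemoNode {
  tag: string;
  props: Record<string, unknown>;
  kids: Array<DemoNode | string>;
}

const demoH: H<DemoNode> = (tag, props, kids) => ({ tag, props, kids });

// Example value, as a client might read it from `textRange.formatted`.
const formatted: MarkdownElement[] = [
  {
    elementName: 'p',
    properties: {},
    children: ['Hello, ', { elementName: 'strong', properties: {}, children: ['world'] }],
  },
];

const rendered = render(demoH, formatted);
```

Because the client decides what `h` does with each element, it can restyle, filter, or update individual nodes without re-parsing any HTML.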

Advantages of this approach

Option 1b: option 1, but apply subset of Markdown in clients

This would be basically the same as option 1, except clients would have:

Option 1c: option 1 (or 1b) + HTML serialization in the engine

While I want to discourage producing and consuming arbitrary blobs of HTML, I do recognize that it has some appealing conveniences for some use cases. We can consider extending option 1 to include both the structured format as well as an HTML serialization of it. For a client, this would look like:


interface TextRange {
   /* ... */
-  get formatted(): unknown;
+  get formatted(): MarkdownElement[]; // Or MarkdownBlockElement[]
+  get asHTML(): string; // Consider: `unsafe_asHTML` or some other discouraging name
   get asString(): string;
}
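
A minimal sketch of what an engine-side serialization could look like, assuming the general `MarkdownElement` shape from option 1. The function names are hypothetical, only string-valued properties are serialized (a `style` object would need its own handling), and it does not attempt any URL sanitization for `href`.

```ts
interface MarkdownElement {
  elementName: string;
  properties: Record<string, unknown>;
  children: MarkdownChild[];
}

type MarkdownChild = MarkdownElement | string;

// Escape text so it cannot be interpreted as markup or break out of an attribute.
const escapeHTML = (text: string): string =>
  text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');

// Assumes property names come from the engine's fixed subset, so only values are escaped.
const serializeProperties = (properties: Record<string, unknown>): string =>
  Object.entries(properties)
    .filter((entry): entry is [string, string] => typeof entry[1] === 'string')
    .map(([name, value]) => ` ${name}="${escapeHTML(value)}"`)
    .join('');

const serializeHTML = (nodes: readonly MarkdownChild[]): string =>
  nodes
    .map((node) => {
      if (typeof node === 'string') {
        return escapeHTML(node);
      }

      const { elementName, properties, children } = node;

      return `<${elementName}${serializeProperties(properties)}>${serializeHTML(children)}</${elementName}>`;
    })
    .join('');

const html = serializeHTML([
  {
    elementName: 'p',
    properties: {},
    children: [
      '1 < 2 & ',
      { elementName: 'a', properties: { href: 'https://getodk.org' }, children: ['ODK'] },
    ],
  },
]);
// html === '<p>1 &lt; 2 &amp; <a href="https://getodk.org">ODK</a></p>'
```

Serializing from the structured tree means the only HTML a client receives is built from the supported subset, with all text content escaped.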

eyelidlessness commented 2 months ago

Added wrinkle for Option 0: part of Collect's behavior is determined by passing the output of `markdownToHtml` to `Html.fromHtml` (an Android API, which in turn relies on TagSoup for parsing).

This came up when I noticed that there must be implicit behavior for &nbsp;, and presumably HTML/XML entities more generally. This is something we could also ape, but I have some pretty serious reservations about implicit passthrough of HTML without constraints beyond the regex portion of the show.