loomchild / theo-rails

Theo is a small and elegant HTML-like template language for Ruby on Rails, featuring natural partials and computed attributes.

Theo should be valid HTML (or almost valid) #5

Closed loomchild closed 1 week ago

loomchild commented 2 weeks ago

Background

The .theo syntax should be valid HTML, or at least close to it, so that most parsers, such as web browsers or the GitHub Markdown processor, won't get confused. The current <_button> syntax is incorrect, since a custom element name must start with a letter (see reference). The component <Button> syntax is also technically invalid, since it starts with an uppercase letter (though it seems to parse well).
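To make the naming rule concrete, here is a minimal sketch of a validity check. It is an ASCII-only approximation (start with a lowercase letter, contain a hyphen, no uppercase letters); the real grammar in the HTML spec allows more characters and excludes a few reserved names, and the helper name `valid_custom_element_name?` is made up for this example.

```ruby
# Simplified, ASCII-only approximation of the custom element name rule:
# starts with a lowercase letter, contains at least one hyphen,
# and has no uppercase letters. Not the full spec grammar.
CUSTOM_ELEMENT_NAME = /\A[a-z][a-z0-9._]*-[a-z0-9._-]*\z/

# Hypothetical helper, not part of Theo.
def valid_custom_element_name?(name)
  CUSTOM_ELEMENT_NAME.match?(name)
end

%w[_button Button button-partial button@].each do |name|
  puts "#{name}: #{valid_custom_element_name?(name)}"
end
# _button: false
# Button: false
# button-partial: true
# button@: false
```

Under this approximation only the original -partial suffix passes, which matches the observations above.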

Attribute name syntax, on the other hand, is more permissive (see reference), but for custom elements specifically it's more restrictive (see this reply, reference and XML spec).

Possible solutions for partials

1. Suffix

Originally, we used the -partial suffix (e.g. <button-partial>), which produced valid custom elements thanks to the -. Such partials are also unambiguous and easy to grep, and the suffix guarantees that the partial name won't clash with any standard HTML tag name. On the other hand, it was too verbose (see #2).

We could shorten it by allowing a -p suffix as an equivalent of -partial, but this is not self-explanatory.

Or perhaps better, use a special suffix symbol, such as @, % or _. Although it's not technically correct, many HTML parsers seem to accept it.

We could also allow the user to specify a suffix, similar to TailwindCSS prefix. It might be necessary in the future to cover edge cases, but I prefer to focus on default syntax for now, as it'll be the most widely used.

2. Valid prefix

We could use a valid, unique prefix, such as <theo:button /> or <partial:button />. It can be too verbose and make the code harder to read.

3. Marker attribute

We could add a special attribute to a partial tag, e.g. <button partial />. This will make the code shorter only if the element has content. On the other hand, it won't avoid name clash with a standard HTML element.

4. Implicit

Any unrecognized HTML element will be treated as partial. This produces the shortest syntax, and other frameworks such as Vue.js are doing it as well.

On the other hand, it can be ambiguous, especially since Rails encourages single-word partial names (e.g. _form in scaffold) and form helpers would clash with HTML form elements (label, select). We could combine this solution with a special attribute, such as theo-off or no-partial to distinguish them.

5. PascalCase

Require all partials to be invoked using PascalCase, e.g. <Button />. The convention of changing snake_case to PascalCase is well-known in Ruby and also used in other libraries. On the other hand, HTML parsers are case-insensitive for tag names, so it still won't prevent name clashes with existing HTML elements (e.g. <Button>Click</Button> will be displayed as an HTML button before being processed by Theo).
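A rough sketch of both points: the PascalCase to snake_case mapping, and why the clash remains. The element list is a small illustrative subset, and the helper names `partial_name` and `clashes_with_html?` are assumptions made for this example, not Theo's API.

```ruby
require "set"

# Small illustrative subset of standard HTML element names.
HTML_ELEMENTS = Set.new(%w[button form label select table input div span])

# Convert a PascalCase tag name to a snake_case partial name,
# e.g. "MessageList" -> "message_list" (plain Ruby, no ActiveSupport).
def partial_name(tag)
  tag.gsub(/([a-z0-9])([A-Z])/, '\1_\2').downcase
end

# HTML tag names are matched case-insensitively, so compare lowercased.
def clashes_with_html?(tag)
  HTML_ELEMENTS.include?(tag.downcase)
end

puts partial_name("MessageList")        # message_list
puts clashes_with_html?("Button")       # true
puts clashes_with_html?("MessageList")  # false
```

Multi-word names such as MessageList avoid the clash, while single-word names such as Button or Form do not.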

Possible solutions for ViewComponent components

1. Leave it as-is

The current syntax is readable and is well-supported by most parsers. On the other hand, it's very ViewComponent specific.

2. Implicit

Use the same syntax as for partials, but check whether a component is defined? before deciding whether to render a component or a normal partial.
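One way this lookup could work, as a sketch only: the `resolve_tag` helper and the "Component" class suffix are assumptions, the stand-in class replaces a real ViewComponent, and autoloading behaviour in a real Rails app is not covered here.

```ruby
# Hypothetical resolver: decide whether a tag maps to a component class
# or to a partial, based on whether "<Tag>Component" is defined.
def resolve_tag(tag, namespace: Object)
  class_name = "#{tag}Component"
  # const_defined? raises NameError for names that are not valid constants
  # (e.g. "_buttonComponent"), so only check PascalCase tags.
  if tag.match?(/\A[A-Z]\w*\z/) && namespace.const_defined?(class_name)
    [:component, namespace.const_get(class_name)]
  else
    # Fall back to a snake_case partial name.
    [:partial, tag.gsub(/([a-z0-9])([A-Z])/, '\1_\2').downcase]
  end
end

# Stand-in for a real ViewComponent class, for demonstration only.
class ButtonComponent; end

p resolve_tag("Button")       # [:component, ButtonComponent]
p resolve_tag("MessageList")  # [:partial, "message_list"]
```

The same tag silently changes meaning once a matching component class appears, which is the trade-off of the implicit approach.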

Possible solutions for dynamic attributes

1. Leave it as is

If most parsers support the current syntax, leave it as is, since it plays nicely with ERB. I suppose it works because stricter rules apply only to custom elements, and parsers don't care to distinguish them from standard elements.

2. Use : prefix, similar to Vue.js

This will produce a valid custom element in HTML and will be easier to learn for developers coming from Vue.js. On the other hand, it will clash with Alpine.js shorthand syntax.

DanielJackson-Oslo commented 2 weeks ago

I disagree with the premise that it should be valid HTML.

ERB is not valid HTML. Phoenix components are not valid HTML. Laravel Blade is not valid HTML. Hell, even JSX from React is not valid HTML. (though it looks like it)

In fact, it's a bonus that it's not valid HTML, as that means there can be no conflict between Theo Partials and actual HTML.

If it is valid HTML, then you risk conflicts between f.ex. WebComponents and Partials.

loomchild commented 2 weeks ago

@DanielJackson-Oslo Thanks for jumping into a discussion so quickly, I really appreciate it! I added more detail to the description - please make sure to re-read it. Please note that I opened this issue in response to #4.

I agree with you that it doesn't have to be valid HTML. Furthermore, it shouldn't be, since as you said, it can clash with existing front-end technology like WebComponents, Alpine.js, Vue.js, etc. Having a front-end component and a partial with the same name seems like an edge case, but it would still be difficult to distinguish them when reading the code.

On the other hand, the fact that most HTML parsers get confused still bothers me.

What are your thoughts on making the syntax almost-valid, to keep most parsers happy (browsers, code highlight, etc.), but avoiding name conflicts? For example, I somewhat like the suffix idea (<button@ />).

loomchild commented 2 weeks ago

(I added "2. Valid prefix" solution above)

matthewblott commented 2 weeks ago

@loomchild that <button@ /> syntax just looks plain wrong to my eyes. I've done a lot of C# work and I got the idea of <theo:button /> from ASP.NET's tag helpers which is a great feature as it allows you to augment HTML with server side code but doesn't cause any design problems.

DanielJackson-Oslo commented 2 weeks ago

@matthewblott To me it is OK, but not as nice as <_button />

I really like the parallel between the filename of a partial starting with an underscore ("_button.html.theo") and using the very same convention in my markup.

Of course that means syntax highlighting breaks, but syntax highlighting will have to learn Theo anyway.

As far as I understand, you want a browser to be able to parse it, so you can preview the design and work on HTML+CSS without running Rails? How do you do that with ERB today?

Surely that's impossible for all but the most basic partials anyway?

Having it be might work, sort of, but <theo:list collection%="Model"> won't work, as it of course won't spit out the list. It will just look like a div?

DanielJackson-Oslo commented 2 weeks ago

I think the main quest here should be to create a syntax highlighting library to go along with Theo, so you are free to create the best developer experience and most beautiful code, without having to be limited by the existing syntax highlighting.

To me the rationale of Theo is to make my view code beautiful, composable, and make me happy just by looking at it.

<_button> matching _button.html.theo makes me happy in a way that <theo:button> or <button_> or <button-partial> doesn't.

I also really love that CamelCase is reserved for ViewComponents in the current version. That just feels so nice to go from the god awful ugly <%= render(MessageListComponent.new) %> to <MessageList />. My god it's SO MUCH NICER. And still using the same CamelCase as we would for a ViewComponent! I really love it.

That said, CamelCase would look nice for partials too, but I don't refer to my partials in CamelCase anywhere else. So while it would make it easier to transition from React/JSX, I still feel like <_message_list collection%="@messages"> is just a really nice parallel to "_message_list.html.theo"
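To illustrate the parallel between the two notations and the ERB they replace, here is a hypothetical translation of attribute-free, self-closing tags. It is a sketch, not Theo's actual compiler; the `to_erb` helper is made up, and attributes such as collection%= are deliberately not handled.

```ruby
# Hypothetical translation of attribute-free, self-closing tags into ERB.
# Illustration only; this is not how Theo is implemented.
def to_erb(source)
  source
    # <_button /> -> partial named after the file _button.html.theo
    .gsub(%r{<_([a-z][a-z0-9_]*)\s*/>}) { "<%= render partial: '#{$1}' %>" }
    # <MessageList /> -> component class MessageListComponent
    .gsub(%r{<([A-Z][A-Za-z0-9]*)\s*/>}) { "<%= render(#{$1}Component.new) %>" }
end

puts to_erb("<_button />")
# <%= render partial: 'button' %>
puts to_erb("<MessageList />")
# <%= render(MessageListComponent.new) %>
```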

matthewblott commented 2 weeks ago

Technically ThisIsCalledPascalCase and is named thus to avoid confusion with thisThatIsCamelCase. This distinction doesn't always matter but it means a lot more with front end templating.

This is really a design decision. I laid out my opposition to the underscore in #4 but not everyone will feel the same.

I assume there's agreement with using angle brackets? What is the issue with using PascalCase which would make Theo consistent with other component frameworks?

loomchild commented 2 weeks ago

Thanks a lot for sharing your opinions guys, I'll try to invite a few more people to express theirs and will decide in a week or two.

I am thinking that perhaps we could have two alternative syntaxes (similar to other libraries), but I prefer to avoid complicating things now, and I believe that the defaults are very important.

(cheers for clarifying the distinction between camelCase and PascalCase - I corrected the issue description)

loomchild commented 2 weeks ago

Regarding PascalCase - it will still be problematic when a partial name clashes with an HTML element name (e.g. <Form>, <Button>), since an unprocessed .theo file will be rendered incorrectly. But perhaps it's not a big deal, since nobody writes HTML in uppercase these days and we don't often render unprocessed files in a web browser.

matthewblott commented 2 weeks ago

I'm old enough to remember the earliest websites and the HTML was always written like this:

<TABLE>
  <TR>
    <TD><INPUT TYPE="TEXT" VALUE="One"></TD>
    <TD><INPUT TYPE="TEXT" VALUE="Two"></TD>
    <TD><INPUT TYPE="TEXT" VALUE="Buckle"></TD>
    <TD><INPUT TYPE="TEXT" VALUE="Shoe"></TD>
  </TR>
</TABLE>

I've never seen pascal case used for vanilla HTML, only ever a component framework like React or Svelte.

loomchild commented 1 week ago

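I am currently thinking of supporting both notations:

  • <_button /> (for the aesthetic, if developer doesn't care about HTML parsers)
  • <Button /> (familiar for React and Vue.js developers and parser friendly)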

But I would not explicitly distinguish ViewComponents from partials at this level and auto-detect them with defined?(ButtonComponent). I assume not many people mix components and partials in the same code base, and they can be further separated via partial / component attribute if someone wants to be explicit.

matthewblott commented 1 week ago

@loomchild sounds like a plan 🙂

DanielJackson-Oslo commented 1 week ago

I am currently thinking of supporting both notations:

  • <_button /> (for the aesthetic, if developer doesn't care about HTML parsers)
  • <Button /> (familiar for React and Vue.js developers and parser friendly)

I think you should only pick one. <Button /> is fine by me, even though I prefer <_button>, but either way I think it's better to just have one.

The menu is omakase, after all. One standard means one less decision to make.