mustache / spec

The Mustache spec.
MIT License

Whitespace clarification issue #97

Closed remorse closed 10 months ago

remorse commented 8 years ago

Hi! It isn't clear from the spec whether whitespace is allowed within the tag in cases other than simple substitution. That is, is this legal?

{{ #thing }}

How about this?

{{ & replacement }}

There are tests in the spec for ensuring that {{ something }} works, but nothing about {{ #something }}{{ /something }}, and at least one implementation doesn't accept the extra space, but does pass the tests.

My issue is that I am in a situation where I want to use different delimiters (I want to make them look like xml processing instructions, so some automated tools can work with the raw templates properly), and

<?m #something ?>

is not being accepted -- only

<?m#something?>

, which is ugly...

Edit: I'm asking partly because the documentation file for Mustache implies that you can make this change; their example for "Set delimiter" changes to "erb style tags", which I've always written with whitespace around the contents.

Thanks, Ricky

kaoru commented 8 years ago

According to the comment at the top of t/specs/sections.yml section tags _MUST be a non-whitespace character sequence_.

remorse commented 8 years ago

Hm. The reference implementation though doesn't enforce this -- at least, the demo on http://mustache.github.io/#demo allows you to have whitespace between the '{'s and the '#' or '/'.

Also, this is contradicted by a test in that same file. The last test listed (name "Padding") allows {{# boolean }}, which includes whitespace characters within the tag. So either that test is invalid, or the documentation at the top is incorrect.

(sorry, multiple edits.)

kaoru commented 8 years ago

Huh, you're right... the whole test case is:

Whitespace Insensitivity

- name: Padding
  desc: Superfluous in-tag whitespace should be ignored.
  data: { boolean: true }
  template: '|{{# boolean }}={{/ boolean }}|'
  expected: '|=|'

And the "desc" key description definitely implies that all whitespace should be ignored.
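To make the test concrete, here is a minimal Python sketch of a renderer that passes "Padding". It is not any real implementation: `render_sections` is a hypothetical helper that handles only flat boolean sections with the default delimiters. It requires the `#` or `/` to follow `{{` directly and ignores whitespace around the name, which is all the test template actually exercises.

```python
import re

# Hypothetical pattern: the sigil (# or /) must directly follow "{{";
# whitespace is tolerated only around the section name.
SECTION = re.compile(r"\{\{#\s*(\w+)\s*\}\}(.*?)\{\{/\s*\1\s*\}\}", re.DOTALL)


def render_sections(template: str, data: dict) -> str:
    """Render flat, non-nested boolean sections; leave everything else untouched."""
    def replace(match: re.Match) -> str:
        name, body = match.group(1), match.group(2)
        return body if data.get(name) else ""
    return SECTION.sub(replace, template)


# The spec's "Padding" template: whitespace after the sigil.
padding = render_sections("|{{# boolean }}={{/ boolean }}|", {"boolean": True})

# Whitespace before the sigil is not recognised by this sketch.
leading = render_sections("|{{ #boolean }}={{ /boolean }}|", {"boolean": True})
```

Here `padding` is `'|=|'`, as the test expects, while `leading` comes back unchanged. A renderer like this passes the test and still rejects {{ #boolean }}.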

I guess we'll see if anybody from the Mustache team comes back with a comment :-)

groue commented 8 years ago

I'm not a mustache maintainer, I only developed https://github.com/groue/GRMustache.

My experience on the subject tells me that mustache tokens are and should be white-space free: {{, {{<, {{#, etc. When you change delimiters this can give <%, <%<, <%#, etc.

The goal is to avoid ambiguous syntax like {{ <foo }}. Is it a partial named foo, or a variable tag containing the identifier <foo? The parser has to choose. A good solution is to start ignoring white space only after the mustache token has been identified. Here it is thus a variable tag containing the identifier <foo. Some Mustache implementations go even further in clarity and prevent identifiers from starting with a reserved character, and yield a syntax error for {{ <foo}}, claiming that <foo is not a valid identifier. Most importantly this prevents users from thinking they are writing a partial tag when they are not.
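A minimal Python sketch of that rule, for illustration only: `classify_strict` and the `SIGILS` set are hypothetical names, not taken from GRMustache or any other implementation. The function receives the text between the delimiters and looks for a tag-kind character at position 0 only, trimming whitespace afterwards.

```python
from typing import Tuple

# Assumed set of tag-kind characters; real implementations may differ.
SIGILS = "#^/<>&!$"


def classify_strict(inner: str) -> Tuple[str, str]:
    """Split the text between the delimiters into (sigil, name).

    The sigil is recognised only as the very first character; whitespace is
    trimmed after that decision. An empty sigil means a plain variable tag.
    """
    if inner and inner[0] in SIGILS:
        return inner[0], inner[1:].strip()
    return "", inner.strip()


after_sigil = classify_strict("< foo ")   # inner text of {{< foo }}
before_sigil = classify_strict(" <foo ")  # inner text of {{ <foo }}
```

With this rule `after_sigil` is `('<', 'foo')`, while `before_sigil` is `('', '<foo')`, a variable tag whose identifier is `<foo`. A stricter variant could raise an error in the second case instead.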

So I advise against liberal white space, and regret that the spec is so shallow on the subject.

Now I don't expect any clarification of the spec, which has been stuck for years: in the end, do as you wish.

remorse commented 8 years ago

I'm afraid I have to disagree. Having {{#foo}}, {{ #foo}}, {{# foo}} (and possibly {{ #foo }}) all mean different things seems like it would cause all kinds of problems. The spec should probably require that identifiers cannot start with any of the reserved characters. I'm not trying to argue for that, though; my point is that {{#foo}} and {{ #foo }} should not be different. Mustache.js already allows this, so either it is incorrect, or the spec is incorrect in not specifying this to be allowed.

In particular, I want to make the tags look like XML processing instructions, so that the templates can go through automated tools that reformat or transform them. The XML spec says that the format is:

'<?' PI-Target (' ' PI-Content)? '?>'

That is, there is a target for the instruction, and then optionally whitespace and additional content. By the logic currently in use, I would have a different PI-Target for each different starting character.
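A rough Python sketch of how a tool following that simplified grammar might split the two spellings. The regular expression is an approximation written for this example, not the full XML production, and `split_pi` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Simplified processing-instruction pattern: a target with no whitespace or
# "?", then optional whitespace-separated content. Not the full XML grammar.
PI = re.compile(r"<\?([^\s?]+)(?:\s+(.*?))?\?>", re.DOTALL)


def split_pi(text: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (target, content) for a single processing instruction, or None."""
    match = PI.fullmatch(text)
    if match is None:
        return None
    target, content = match.group(1), match.group(2)
    if content is not None:
        content = content.strip()
    return target, content


spaced = split_pi("<?m #something ?>")
packed = split_pi("<?m#something?>")
```

`spaced` yields the target `m` with content `#something`, whereas `packed` yields the target `m#something` with no content at all, so every indicator character produces a different target.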

If this behaviour of not allowing a space between the delimiters and the initial character is intended, I can go look at other templating systems. But it seems to me that it isn't intended, just not explicit. In which case, it would be nice to get a ruling (either way).

Danappelxx commented 3 years ago

Hi! Sorry for reviving, but if there is interest in adding an explicit rule for this in the spec, please feel free to comment.

gasche commented 3 years ago

The implementation I am familiar with, ocaml-mustache, trims whitespace around tag names, but expects the "tag kind" marker (#, &, <, > etc) to come right after the opening delimiter, without whitespace in-between. This is similar to what @groue reports for GRMustache.

The mustache(5) manpage has examples with whitespace around the tag names ({{> user}}, {{ default_tags_again }}), but none with whitespace before the tag marker. The specification is similar (note: when it says that the tag name must be non-whitespace, as pointed out by @kaoru above, I understand it to mean that it must not be "all whitespace", a blank string).

I believe that this suggests that most implementations are likely to support spaces around tag names, but not before the tag marker. Changing the spec right now to allow whitespace before the tag marker would then make implementations non-conformant, so it should at best be handled very carefully. Personally I am not convinced that the proposed use-case is strong enough to require changes in several implementations in this way.

remorse commented 3 years ago

Hi! I still think this is something that should be allowed. As soon as you move away from {{ as the opening delimiter, it starts to look confusing ([%/ id %] vs [% / id %]; <%# foo %> vs <% # foo %>, and more — all of these are easier to see visually when the delimiter is separable from the kind marker).

As of when I filed this issue, the specification was silent on whether this should be allowed or not. So any implementation that assumed it was disallowed is just as invalid as any implementation that assumed it should be allowed. As the spec is versioned, if this is seen as something that will break implementations to support, I think it should be allowed in a new version (I think this would be a minor update, as it's clarifying something that was not already specified), which is the point of versioning...

jgonggrijp commented 3 years ago

I side with @groue and @gasche that a space between the opening delimiter and the indicator (I suggest this name for #, >, ! etcetera) should not be allowed. By sticking the indicator directly to the opening delimiter, you make it very clear what you mean:

{{>
    extremelyAbsurdlyReallyVeryLongPartialName
}}

Things get much more ambiguous if you allow whitespace in between, because in that case the indicator might even be on another line:

{{
    >extremelyAbsurdlyReallyVeryLongPartialName
}}

Finding only {{ initially makes me expect an interpolation, not a partial.

remorse commented 3 years ago

Just to be clear, I'm not suggesting that a space be required, only that it be allowed. If you don't want any space between the delimiter and the indicator, don't put any.

Also, if people are inserting newlines into their tags, they may find it clearer to have the indicator on the same line as the token. For instance:

This is a really long paragraph where I am going to have a ton of things involved in the middle {{ foo }} of the line, and then I need to indicate that I'm starting a list, but I don't want to have a newline, so I put it here {{
    # barg
}} and {{ item }} {{
    / barg
}} and continue.

That seems actually kind of useful to me, as it makes it obvious there's some stuff going on in the middle of the sentence. Otherwise, it kind of runs off the edge of the screen, and you might miss it.

Also, having dealt with this for a while (until I hacked a local copy of the mustache templating library so that I can do it the way I want), with the double-braces, it's not really a problem. It's when you're trying to fit in with other tools that want delimiters like '' (this actually isn't possible without hacking in the space, as html comments require a space), '<?temp ?>', '[+ +]', etc., that it becomes easier to understand what's going on if the indicator is separate from the delimiter.

remorse commented 3 years ago

(Sorry, those first delimiters are '<!-- -->', which don't show up in the posted comment. And because I forgot to backtick them, the others are '<?temp ?>', '[+ +]', and things like '<% %>'.)

jgonggrijp commented 3 years ago

You make valid points about the long lines and the HTML comments, but I think there are other solutions available, which I'd prefer. For the long lines:

This is a really long paragraph where I am going to have a ton of things{{!
}} involved in the middle {{ foo }} of the line, and then I need to indicate{{!
}} that I'm starting a list, but I don't want to have a newline, so I put it{{!
}} here {{# barg }} and {{ item }} {{/ barg }} and continue.
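The trick works because an inline comment tag renders as nothing, so a line break hidden inside `{{! ... }}` never reaches the output. A small Python sketch under that assumption: `strip_comments` is a hypothetical helper that handles only comment tags with the default delimiters, and it does not implement the spec's rules for standalone comment lines.

```python
import re

# Matches a comment tag, including any newlines inside it (default delimiters).
COMMENT = re.compile(r"\{\{!.*?\}\}", re.DOTALL)


def strip_comments(template: str) -> str:
    """Remove comment tags; everything else is returned verbatim."""
    return COMMENT.sub("", template)


template = (
    "a ton of things{{!\n"
    "}} involved in the middle{{!\n"
    "}} of the line."
)
joined = strip_comments(template)
```

`joined` is the single line `a ton of things involved in the middle of the line.`, even though the template source spans three lines.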

For the HTML comments, an alternative solution will depend on your reason for wanting to use <!-- --> as delimiters. Why do you want to do this?

Regarding the ERB-style <% %> delimiters, I see no problem with sticking the indicator directly to the opening delimiter, since this is already common practice in ERB-like systems. I can't comment on the other examples that you gave because I don't know them. In general, however, if the editor of the template is aware that she's dealing with a Mustache template (which I hope she is), then I don't see why sequences like <?temp# foo ?> or [+> bar +] would be any more problematic than {{# foo }} and {{> bar }}.

When you write "It's when you're trying to fit in with other tools that want delimiters like ...", I'm getting the impression that you are using the same delimiters as some other target/source language. I think the intention of {{= =}} is to enable you to use different delimiters so you can make a clean separation between languages.

remorse commented 3 years ago

So, the "solution" you provided for the long lines is something you prefer, which is great! You like it better. My response was along the lines of why other people might prefer something different (and was completely invented as an example, not something I've done).

My original issue was that I would like to add the space because it makes it easier for me to read things. I'm not trying to mandate the space, but have it for those who want it!

The advantage of other delimiters (in particular, the html processing instruction one) is that automated code-formatting tools can understand them as things that should be handled in the right way (for whatever the local definition of "right" is). (And, to be honest, I only tried HTML comments once, and found that the processing directive was a better one.)

Yes, maybe you are fine without the space. But I found it annoying to read. Hacking in the space makes it easier for me.

Right now, we are both arguing about what we like better. The only actual concern I have seen here that is not a matter of presentation and preference is that various existing tools don't accept the space (which again, the spec does not mention either way whether it should be allowed or disallowed, so they are just as out-of-compliance as a tool that does allow it). But this is why the spec is versioned! And the change is backwards-compatible — nothing that was previously accepted is suddenly invalid.

Danappelxx commented 3 years ago

For what it's worth, MuttonChop allows for whitespace in the tag, and only cares about the first non-whitespace character. Can we do a survey of Mustache implementations to see how breaking a change this would be?
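As a rough illustration of a "first non-whitespace character" rule (a Python sketch, not MuttonChop's actual code; `classify_lenient` and the `SIGILS` set are hypothetical names):

```python
from typing import Tuple

# Assumed set of tag-kind characters; real implementations may differ.
SIGILS = "#^/<>&!$"


def classify_lenient(inner: str) -> Tuple[str, str]:
    """Split the text between the delimiters into (sigil, name).

    Whitespace is trimmed first, so the sigil is the first non-whitespace
    character. An empty sigil means a plain variable tag.
    """
    stripped = inner.strip()
    if stripped and stripped[0] in SIGILS:
        return stripped[0], stripped[1:].strip()
    return "", stripped


section = classify_lenient(" #thing ")           # inner text of {{ #thing }}
unescaped = classify_lenient(" & replacement ")  # inner text of {{ & replacement }}
ambiguous = classify_lenient(" <foo ")           # inner text of {{ <foo }}
```

Under this rule both tags from the original question are accepted, as `('#', 'thing')` and `('&', 'replacement')`. The cost is that `ambiguous` becomes `('<', 'foo')`: a variable whose name starts with a sigil character can no longer be written.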

jgonggrijp commented 3 years ago

My own implementation doesn't allow it at this time, but allowing it would not be difficult. I was hoping to avoid this, however, because it also means that variable names can no longer start with #, >, <, /, ^, !, $ or & (or begin with { and end with }). This is not so much a breaking change for implementations but for users that rely on it. This argument was already given by others, so I didn't repeat it until now.

jgonggrijp commented 10 months ago

The original question, "where in a tag does the spec currently allow whitespace", seems to have been answered. For the other question, "where else should the spec perhaps allow whitespace", I suggest opening a new discussion. Closing this ticket.