groue / GRMustache

Flexible and production-ready Mustache templates for MacOS Cocoa and iOS
http://mustache.github.com/
MIT License
1.44k stars 190 forks source link

Escape query #42

Closed KosmicTask closed 11 years ago

KosmicTask commented 11 years ago

This may be a general issue with Mustache rather than something applicable to GRMustache.

Is it possible to disable HTML escaping entirely? I am not generating HTML and am seeing escapes whenever escapable characters appear in the text. I know I can use {{{ var }}} but it doesn't seem to work with {{# var? }}. Ideally I would turn off html escaping and use {{ }} everywhere. In my case users may modify their tasks using Mustache to produce task templates so having a simple {{ }} syntax would be a big plus.

csh template:

{{#task-input-index-1-based?}}set {{/}}{{task-input}}{{#task-input-id?}} {{.}}{{/}}{{#task-input-index-1-based?}} = "${{.}}"{{/}}

Required:

set taskInput1 = "$1"

Obtained:

set taskInput1 = "$1"
groue commented 11 years ago

Hi back!

I'm keen to discuss with you how GRMustache would disable HTML-escaping entirely.

However, one thing after another: let's first use the provided tools {{{...}}} and {{#...}} to build your expected rendering. You'll feel more comfortable with the library.

Based on your question and your example, I think it will be interesting that we explain the double-escaping of the quotes "$1". That is unexpected for sure, and I smell there the root of your issues. Do you have snippets of the code involved in this?

KosmicTask commented 11 years ago

Hi Gwendal

In order to help me sort out how I am messing things up I have added a simple test app to https://github.com/mugginsoft/GRMustache. You might want to pull this in - maybe not. I have added it as a subproject so that it exists pretty independently of the library code. I called it GRMustachio, an obvious play on I/O. I used MGS as a class prefix so there should be no confusion about what is and isn't library code.

A bit of experimenting shows that quoting does work as expected. My problem lies in the fact that I generally run the output of one template through another. I build a series of parameters with one template and then run each of those through another (I do this because the user can insert code fragments like variables and function calls as well as entire scripts). So, I can get generate the required output as long as I am strict about using {{{ }}} aka 3m, in my template hierarchy.

However, this means that end users will have to use 3m too. A failure to do so will result in the generation of an HTML escape sequence, which will look a little out of place in say JSCocoa or Py-ObjC. Not traumatic, just unexpected.

So for me, working outside HTML, it would simplify things if the library could suspend its HTML awareness. If you are open to incorporating that feature we can work on it.

groue commented 11 years ago

Thanks for this setup. Your use case is interesting. I'll look at your code this evening (it's noon right now).

groue commented 11 years ago

Please remember to tell me how to run GRMustachio, where to look for output, and what is the output you expect.

KosmicTask commented 11 years ago

Its just a simple 3 panel app. JSON object representation, template and rendered object. Same idea as http://mustache.github.com/#demo

KosmicTask commented 11 years ago

Looking at the code it seems that an obvious place to disable HTML escaping is:

GRMustacheCompiler.m

 - (BOOL)parser:(GRMustacheParser *)parser shouldContinueAfterParsingToken:(GRMustacheToken *)token

Here we process our tokens and escape as required.

It depends whether you want to try disable escaping at the level of the tag, template or compiler. Compiler level would be simplest and should give consist results when partials are loaded as everything will go through the compiler at some stage.

A simple implementation strategy would be to use a compiler singleton. The singleton flags whether HTML escaping is on or off and will effect every compiler instance. You have eschewed singletons so far, perhaps by design.

However GRMustacheCompiler is private and not a good candidate for making public. Perhaps we implement a GRMustacheConfiguration singleton.

enum {
    kGREscapeBehaviourHTML = 0,
    kGREscapeBehaviourNone = 1,
} 
typedef NSUInteger GREscapeBehaviour;

@property GREscapeBehaviour escapeBehaviour;

+(GRMustacheConfiguration *)sharedConfiguration {
     static dispatch_once_t pred;
    static MyClass *shared = nil;

   dispatch_once(&pred, ^{
      shared = [[MyClass alloc] init];
   });
   return shared;
}

Not sure whether that singleton dec is iPhone friendly or not. The compiler can then short circuit escaping for GRMustacheTokenTypeEscapedVariable:

// Success: append GRMustacheVariableTag
BOOL escape = [GRMustacheConfiguration sharedConfiguration].escapeBehaviour == kGREscapeBehaviourHTML ? YES : NO;
[_currentComponents addObject:[GRMustacheVariableTag variableTagWithTemplateRepository:_templateRepository expression:token.expression escapesHTML:escape]];
groue commented 11 years ago

Thank you for your suggestion on how to disable HTML-escaping, and your involvement in understanding the GRMustache internal guts. Few people do that, I appreciate that!

But please, if you still need my help, focus on what I asked: give me some sample code which shows how you get the double-escaping of the quotes "$1". This is soooo weird that I'm sure you are missing crucial information.

You present the disabling of HTML-escaping as unavoidable. I don't, and won't, follow you here. So let's first make sure you can render your templates with triple mustache. Then, and only then, we'll discuss the disabling of automatic HTML-escaping, so that you can use simple double mustache.

KosmicTask commented 11 years ago

Sorry. I didn't make myself clear before (somewhere above). Triple quoting works fine in my templates. I had failed to remember that I was pumping my data through multiple templates. Triple quoting all my templates works as it should. No issue (apart from my stupidity).

I would see the disabling of HTML escaping merely as a convenience for situations that do not involve HTML processing. I know my use case isn't common and I understand if you don't want to implement this feature.

groue commented 11 years ago

OK, I'm relieved :-)

We can discuss the disabling of HTML escaping now!

Please hold on, I'll be back soon.

KosmicTask commented 11 years ago

Sorry. I have to attend to something a bit less interesting. Talk later.

KosmicTask commented 11 years ago

What do you think of the idea of implementing a singleton configuration class?

If it is desirable I will implement it.

groue commented 11 years ago

OMG I'm not quite ready to really dig into this.

My early thoughts:

  1. avoid globals if possible (no singleton configuration. No, really)
  2. avoid objects that lie. Today, rendering methods return a string, and a HTMLSafe boolean. If this API survives, it MUST NOT lie. If an object tells me it's HTML-safe, it want to trust it.
  3. allow HTML-escaping templates to embed (and escape) HTML-unescaping templates
  4. hence the need for declaring a template as:
    • HTML-safe with HTML-unsafe input (escaping needed for double-mustache tags, HTML-safe output)
    • HTML-safe with HTML-safe input (escaping not needed for double-mustache tags, HTML-safe output)
    • HTML-unsafe with HTML-unsafe input (escaping not needed for double-mustache tags, HTML-unsafe output)

Your csh scripts belong to the "HTML-unsafe with HTML-unsafe input" realm. I want (point 3) them to be safely embeddable (escaped) (even via a {{>partial}} Mustache tag) into a HTML-safe template.

You have a correct analysis: we have to choose at which level the HTML-unescaping belongs. Until I change my mind, I believe it belongs to the compilation phase. Nodes of the syntax tree belong to a single realm.

Where to store the option?

More tomorrow.

groue commented 11 years ago

I mean: your idea of a singleton configuration class could surely do the trick. Granted 100%. This is a neat idea.

I just want to explore cleaner solutions before we dig into it right away.

If being clean shows any flaw, we'll look into the singleton configuration. If being clean is possible, it will be much better.

KosmicTask commented 11 years ago

No problem. I wont implement anything for now. I can live with {{{...}}} for my next release.

As is normal in these cases, because I am not the author, I only really see my usage case scenario not the bigger picture. So, my opinion might be worthless, however:

To me your rules make sense:

  1. Allow optional escape behaviour pragma tags in template files (do any pragmas exist as yet?).
  2. GRMustacheTemplate implements an escape behaviour API.
  3. GRMustacheTemplateRepository implements an escape behaviour API.

Rules at a lower level override rules at a higher level. Parent rule overrides child rule for partials.

So:

  1. No escaping behaviour can be achieved by operating solely at the level of GRMustacheTemplateRepository in the absence of any other directives.
  2. Individual templates can define their own behaviour.
  3. When a template marked as unsafe is rendered as a partial in a template marked as safe the parents safe rule is honoured.

The singleton approach is fine for a coarse grained solution. But operating at the repository level is probably sufficient and more flexible.

Enough for now. Time to put the cat out.

Thanks for your time on this.

groue commented 11 years ago

I'm glad this makes sense to you :-) I'll shed an eye on the code, to see if it agrees as well.

do any pragmas exist as yet?

No, there are no more. There used to be a {{%FILTERS}} pragma that was required before GRMustache would parse {{ f(x) }}. I was reluctant to extend the Mustache spec, back then.

KosmicTask commented 11 years ago

I was reluctant to extend the Mustache spec, back then.

The spec has been static for quite a while. Pragmas are common in most languages and generic so they could be a worthwhile addition to the spec rather than just a GRMustache extension.

groue commented 11 years ago

I may not be a good FOSS guy, because I could never convince anybody to embed any GRMustache extension :-)

This morning I've played with the source a little, and checked how it would support our plans. Bottom-up approach, quite a heavy patch actually. I'll try to do it top-down now, starting from a reasonable API (pragmas and template repository).

KosmicTask commented 11 years ago

It takes a particular sort of tenacity to be effective in the FOSS specification space. I certainly don't have it. An extension is fine with me.

Top down sounds like the way to do it. Let me know if I can assist.

groue commented 11 years ago

Hi back Jonathan!

Pragmas are implemented in the HTML-escape-top-down branch.

{{% RENDER:TEXT }} turns the template into a text template that does not escape its input. Those text templates are escaped if they get embedded in HTML templates.

{{% RENDER:HTML }} is there for consistency: it turn templates into HTML templates and triggers classic behavior.

There is a test suite at https://github.com/groue/GRMustache/blob/HTML-escape-top-down/src/tests/Public/v6.1/GRMustacheSuites6_1/text.json

In order to assert internal consistency, there are new error cases. Those get tested in https://github.com/groue/GRMustache/blob/HTML-escape-top-down/src/tests/Public/v6.1/GRMustacheTextTest.m

Would you mind having a look? Completing the picture by adding APIs on GRMustacheTemplateRepository should be a piece of cake after that.

KosmicTask commented 11 years ago

Hi Gwendal.

That was quick! I should be able to take a quick look in half an hour or so.

groue commented 11 years ago

Take your time: I need a rest :-)

KosmicTask commented 11 years ago

No Doubt. I have to finish some client work anyway now. So I will check it out and get back to you tomorrow. Looks promising.

KosmicTask commented 11 years ago

Changes look good. Quite intricate.

Tests seem comprehensive and sensible. All pass including 32bit and GC on OS X 10.8. Partial handling seems well thought out and consistent.

Found one typo in an error message. I pushed this to branch HTML-escape-top-down on my repo.

groue commented 11 years ago

Thanks for the typo. I've been tidying things up this morning, and documenting internals.

Even though new APIs on GRMustacheTemplateRepository are a neat idea, conceptually clean, kind-of convenient etc, I wonder about the verbosity of:

GRMustacheTemplateRepository *repo = [GRMustacheTemplateRepository templateRepositoryWithBundle:nil HTML:NO];
GRMustacheTemplate *template = [repo templateNamed:@"script"];
NSString *rendering = [template renderObject:data error:NULL]:

vs regular HTML rendering

GRMustacheTemplate *template = [GRMustacheTemplate templateFromResource:@"script" bundle:nil error:NULL];
NSString *rendering = [template renderObject:data error:NULL]:

or even

NSString *rendering = [GRMustacheTemplate renderObject:data fromResource:@"script" bundle:nil error:NULL];

Mustache the language is surely HTML-oriented. However it's not a reason for GRMustache users to pay an API price whenever they switch to text rendering. I mean: GRMustache will now fully support text aside HTML: I wish the API does not give any "second-class citizen" impression.

Since you let your users write their own templates, it's likely you'd rather not using the pragmas, and that you'll prefer switching to text output from your code. I'm very open to your suggestions here.

KosmicTask commented 11 years ago

Mustache the language is surely HTML-oriented. However it's not a reason for GRMustache users to pay an API price whenever they switch to text rendering. I mean: GRMustache will now fully support text aside HTML: I wish the API does not give any "second-class citizen" impression.

I agree. I think the issue is bigger than just the text rendering. You need a generic way of representing pragmas via the API. Okay, you only have one pragma at the moment but the idea is extensible. Sure, it's fine to have convenience methods for specific situations but I think the API needs to extend beyond that. Pragmas could be used to define other rendering options, filters or file encodings.

Since you let your users write their own templates, it's likely you'd rather not using the pragmas, and that you'll prefer switching to text output from your code.

Yes. Not using the pragmas would be ideal. Users can take a script, save it as a template and add whatever tags are required. Obviously they could add the pragma too but its an extra non obvious necessity.

More later.

KosmicTask commented 11 years ago

More thoughts. Perhaps not that insightful.

Looking at the existing api its noticeable that it doesn't really allow for this normal pattern:

// allocate
GRTemplate *t = [[GRTemplate alloc] initWithSomething:object];

// set properties and configure
t.configuration = [GRTemplateConfiguration textRenderingConfiguration];

// Go!
NSString *s = [t render:anotherObject];

The heavy use of class functions make it convenient and efficient but it makes it more difficult to integrate API changes.

Mustache the language is surely HTML-oriented. However it's not a reason for GRMustache users to pay an API price whenever they switch to text rendering. I mean: GRMustache will now fully support text aside HTML: I wish the API does not give any "second-class citizen" impression.

The API will have to change obviously. Maybe provide mode direct access to the template and repo models rather than adding more large signature methods.

NSString *rendering = [GRMustacheTemplate renderObject:data fromResource:@"script" bundle:nil error:NULL];

I would think of the above call not as core API usage but as an HTML based convenience. If I want to change text rendering behaviour then it seems acceptable to have to change convenience method or fall back to a lower level API and deal with the model objects more directly.

The pragmas are a means of configuring the render so perhaps we define A GRTemplateConfiguration class that can encapsulate all pragmas. Normally pragmas address the compiler but as the GRMustache compiler is private it makes sense to leave things at the template level. It also makes it clear that all configuration has to be capable of being expressed in the template itself.

At present you pass the HTMLsafe flag around the API. This could be refactored at some future point to pass a GRTemplateConfiguration instance.

A singleton (that again) GRTemplateConfiguration instance could be used to switch the default behaviour of all templates from HTML to TEXT. This would allow universal changes in rendering behaviour while keeping the existing convenience API. The singleton would affect every template in very repository so an API would need to be available to define the default template configuration to be used for a given repo.

That's it!

groue commented 11 years ago

The heavy use of class functions make it convenient and efficient but it makes it more difficult to integrate API changes.

True. I dislike mutable objects, and it starts to show. GRMustache is growing, and immutable patterns are now getting in the way.

Thanks for seeding me with this ideas.

groue commented 11 years ago

Hi Jonathan.

I've been trying to see where your suggestions lead. A GRMustacheConfiguration class has been introduced. [GRMustacheConfiguration defaultConfiguration] is the global one. Template repos can be given their specific configuration.

It's not tested yet so I do not yet guarantee that it works.

Meanwhile, would you mind looking at the inline documentation, see if you believe we're going in the right direction ?

GRMustacheConfiguration doc

GRMustacheTemplateRepository configuration doc

groue commented 11 years ago

BTW for consistency of pragmas with the API, I changed RENDER:TEXT to CONTENT_TYPE:TEXT.

Side notes:

I agree that a configuration class is future proof compared to a bunch of option bits.

I do not regret yet to have the public GRMustacheRenderingObject protocol expose a HTMLSafe boolean that does not match the configuration's contentType. Sure library users have two inconsistent ways to talk about HTML-safety. But I don't want to refactor GRMustacheRenderingObject yet: deprecating the only required method of the protocol would mean optional methods of major version change. Six versions in two years is already too much :-)

groue commented 11 years ago

Another side note (I'm sorry I'm using you to test some ideas, but it looks like you're sensible :-) )

The primary use case is clearly "avoid automatic HTML-escape". This is what I expect people to ask, and this is the topic that the future Guide will gravitate around.

As you see, the GRMustacheConfiguration class does not expose any "escapeBehaviour" property, but a "contentType" property instead. This is the only way I found to specify the mixing of HTML and text templates. I had to bend the primary use case into another formulation, for robustness' sake. One use case is not covered yet: render HTML and avoid escaping. I bet on the fact that nobody needs that.

groue commented 11 years ago

Tests: done.

KosmicTask commented 11 years ago

The primary use case is clearly "avoid automatic HTML-escape". This is what I expect people to ask, and this is the topic that the future Guide will gravitate around.

HTML escaping is clearly what current Mustache users expect to be the default. I presume this will remain the case using the current set of class methods? I would agree that a plain TEXT content type is conceptually the base rendering option.

As you see, the GRMustacheConfiguration class does not expose any "escapeBehaviour" property, but a "contentType" property instead. This is the only way I found to specify the mixing of HTML and text templates. I had to bend the primary use case into another formulation, for robustness' sake.

That sounds like a good choice and echoes HTTP's Content-Type. I would have, hopefully, done something similar.

One use case is not covered yet: render HTML and avoid escaping. I bet on the fact that nobody needs that. I agree. But if an urgent need arises the new design will be able to accommodate it I think.

I will check out the docs, code and tests and get back to you later today. Thanks for being so responsive.

groue commented 11 years ago

Thanks for being so responsive.

GRMustache is made of my idea of what Mustache can be, and the needs of the users :-) When a reasonable feature can not be implemented in userland, I'm happy to provide core support for it!

HTML escaping is clearly what current Mustache users expect to be the default. I presume this will remain the case using the current set of class methods? I would agree that a plain TEXT content type is conceptually the base rendering option.

Oops, my mistake. Yeah, HTML remains the default, and text has to be explicitly required.

I was talking about your use case, the one we discuss in this issue. And this use case is not about text vs. HTML, but "avoid automatic HTML-escape". It turns out it's a text vs. HTML story, after all :-)

I wait for your feedback before I think about high-level APIs, assuming there is a need for them.

Take your time. I mean, this is your feature - manage your time as you want :-)

groue commented 11 years ago

Mutable objects are indeed a nightmare :-)

The last commits make sure our affairs are more robust:

Repositories are initialized with a copy of the current default configuration at the time they are created: you can directly alter a repo config, without setting a new config:

// Build repo and configure it...
repo = [GRMustacheTemplateRepository ...];
repo.configuration.contentType = GRMustacheContentTypeText;

If the user mutates a configuration after he has built a template, he'll now get an exception:

// Load template...
id template = [repo templateNamed:@"document"];

// Now config is locked.
repo.configuration.contentType = GRMustacheContentTypeHTML; // exception
repo.configuration = [GRMustacheConfiguration defaultConfiguration];  // exception

This doesn't apply to defaultConfiguration: you can mutate it as much as you want (that's why repositories get a copy of the default configuration).

I hope it limits the exceptions to reasonably dirty practices.

Shall I say it again? Mutable objects are indeed a nightmare!

groue commented 11 years ago

The guide: https://github.com/groue/GRMustache/blob/HTML-escape-top-down/Guides/html_vs_text.md.

It is linked from the FAQ of the README under the "Is it possible to disable HTML escaping?" question.

groue commented 11 years ago

I was satisfied enough: GRMustache 6.2 is out, with support for text templates. Suggestions for API improvements are still welcome :-)

KosmicTask commented 11 years ago

Thanks Gwendal. Time was short yesterday. Am taking a look now.

groue commented 11 years ago

It's OK. You've given a great deal of help actually. I'm very happy with the result.

KosmicTask commented 11 years ago

I have updated KosmicTask WIP to use text templates and all looks well. All scripts render okay with default configuration set to TEXT and all tags annotated with {{ ... }}. Tests run okay in 32/64 bit and RC/GC.

At this early stage I don't see the need for an explicit GRMustacheTemplate content type API though I am sure that someone will engineer a requirement at some stage!

The guide: https://github.com/groue/GRMustache/blob/HTML-escape-top-down/Guides/html_vs_text.md.

The help doc is clean and explicit. The script samples give a good indication of a text content type usage scenario. The implementation was probably a bit more invasive than you had at first envisaged but I think you have opened up the model/API and that this will prove beneficial in the future.

Shall I say it again? Mutable objects are indeed a nightmare!

Mutability means, I guess, flexibility derived from complexity!

Thanks again for being so responsive and understanding. Exceptional.

groue commented 11 years ago

I'm quite happy we're both satisfied. You came up with a really sane feature request. For https://github.com/mustache/spec/issues/66, I had no difficulty finding other similar requests in other implementations. It was a matter of being the first to ask, and you made it :-)