rip747 / Mustache.cfc

{{ mustache }} for ColdFusion
http://mustache.github.com/
MIT License

Compile Mustache -> ColdFusion #7

Open markmandel opened 12 years ago

markmandel commented 12 years ago

Starting up an issue, so we can discuss implementation, as I've been thinking about this a lot.

The issue with converting Mustache is that what sections and variables do differs at runtime, depending on what type the variable turns out to be.

So the way I see this working is that each {{#section}} would need to be converted to a function (probably one that allows output, to make life super easy). That way you could check the object's type at run time and react accordingly.

You would also need to store the inner parts of the original Mustache template, so they can be passed to lambdas.
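To make that concrete, here is a rough sketch of the kind of helper each {{#section}} could compile down to (renderSection and renderInner are just illustrative names, not a real design):

<!--- hypothetical helper a compiled template would call for each {{#section}} --->
<cffunction name="renderSection" returntype="string" output="false">
  <cfargument name="context" type="any" required="true">
  <cfargument name="key" type="string" required="true">
  <!--- the raw inner mustache text, kept around so lambdas can receive it --->
  <cfargument name="inner" type="string" required="true">
  <cfset var out = "">
  <cfset var item = "">
  <cfset var value = "">

  <cfif NOT isStruct(arguments.context) OR NOT structKeyExists(arguments.context, arguments.key)>
    <cfreturn "">
  </cfif>
  <cfset value = arguments.context[arguments.key]>

  <cfif isArray(value)>
    <!--- list section: render the compiled inner block (renderInner) once per item --->
    <cfloop array="#value#" index="item">
      <cfset out = out & renderInner(item)>
    </cfloop>
  <cfelseif isCustomFunction(value)>
    <!--- lambda: hand it the original inner mustache text --->
    <cfset out = out & value(arguments.inner)>
  <cfelseif isStruct(value)>
    <!--- nested context --->
    <cfset out = out & renderInner(value)>
  <cfelseif isBoolean(value) AND value>
    <!--- truthy scalar: render the inner block against the current context --->
    <cfset out = out & renderInner(arguments.context)>
  </cfif>

  <cfreturn out>
</cffunction>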

Anyone see any major issues with that sort of implementation?

rip747 commented 12 years ago

forgive me for being stupid, but i'm trying to wrap my head around all of this. what i'm trying to understand is what is the primary goal? is it to cache the compilation of the templates?

markmandel commented 12 years ago

Reasons:

  1. Should be faster
  2. Shouldn't have to be regexing strings on every request
  3. Shouldn't have to be doing string replacement on every request (CF can do that for us already).
  4. Shouldn't need a rendering engine on top of a rendering engine.
dswitzer commented 12 years ago

@rip747

The goal (as I see it) would be to see if we can get better performance out of the template by compiling it down to something that can either be interpreted faster or, as @markmandel indicated, converted to something executable.

While the current code runs pretty well, I'm not sure it's well suited for when you have to run the same template in a batch that might include tens of thousands of executions.

Since ColdFusion isn't great at dynamically building and executing code, I'm wondering if it would make more sense to compile a template to some kind of token array that we could process quickly.

I'm thinking of something that would look like this:

[
  {
    fn: // reference to a function to run
    args: // arguments to pass to the function
  }
]

We could then just loop over the array and run each function, passing in the arguments needed. The functions specified would either be the lambda or internal mustache functions designed to handle the logic (some of the functions would just be returning the text as-is.) The goal would be to create something that can be looped through and processed quickly.
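As a minimal sketch, the render loop over that sort of token array could look like this (the fn/args field names and function shapes are just placeholders):

<cfscript>
  // each token: fn = the function to run, args = the arguments it needs
  function renderTokens(tokens, context) {
    var out = "";
    var i = 0;
    for (i = 1; i LTE arrayLen(arguments.tokens); i = i + 1) {
      // the token's function does the work: emit literal text, resolve a
      // variable, render a section, call a lambda, etc.
      out = out & arguments.tokens[i].fn(arguments.context, arguments.tokens[i].args);
    }
    return out;
  }
</cfscript>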

I'm just wondering if we'll be able to squeeze better performance out of compilation in most use cases. My guess is the use cases of a ColdFusion-based Mustache implementation are quite different from those of people looking for Java or JavaScript implementations (which are more likely to be executed many times during the course of the code.) I suspect most CF-based uses are for simpler needs (other than maybe e-mail templates--which would be executed many times in a mail merge type operation.)

markmandel commented 12 years ago

The only question I have is:

Why should we do all this work ourselves, when we could convert it to CFML and have the CFML engine do exactly this work for us?

(Also - as of next week, we'll be using Mustache as the main rendering language on our site, so the closer to the metal the template language is, the better, IMHO)

dswitzer commented 12 years ago

@markmandel

I have several concerns (that may end up being invalid) about converting to CFML:

  1. The overhead of compiling. In my (admittedly limited) experience, compiling CFML on the fly for execution carries overhead, and the whole point of compiling should be to gain performance. To avoid cached-template issues, we'd probably need to end up creating CFML files named w/ a hash of the template string--which has some overhead in itself.
  2. Evaluation issues. As you mentioned, we'd probably need to end up generating a CFC that gets instantiated to protect against variable/namespace collisions (rather than trying to include the dynamic template directly.)
  3. Managing compiled CFML. We'd need to do something to manage the cache of files we create. IMO, the benefit of compiling would be the ability to re-use the compiled template, which means we need to do something to clean up the unused template cache. We could delete the template after use, but then that means we're losing the benefit of re-using the compiled template.
  4. I'd eventually like to implement a Mustache factory which would load as a singleton and pre-cache all our templates for use (we're going to use Mustache for all our various e-mail notifications, which is a big part of our application.) So I ideally want a compilation strategy that works well with that workflow. To pre-compile in a singleton, we'd definitely need to keep the compiled templates on the filesystem, which just means #3 (deleting old templates) becomes important.

This is where closures would potentially rock--being able to generate code on the fly and store it in memory.

Anyway, I'm sure these issues can be worked through.

The main question for me is: are we really gaining performance benefits for the majority of use cases by compiling? I guess we can make compilation an option if it's slower on single runs, with the ability to specify the results of the compile() method in place of a template.

We probably ought to mock up some code to see what we're really looking at.

What do you guys think?

markmandel commented 12 years ago

I have to respectfully disagree with all of your points, Dan.

Re: The overhead of compiling: That is an overhead you experience once, and then the file is stored on the file system, with a name that is a hash of the mustache string. Once it has been compiled you never have to have the overhead ever again. On a production system, the file is stored in the CFML engine template cache, so overhead is the same overhead as almost any CFML template.
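In sketch form, roughly (compileToCFML() and the /compiled directory are made-up names for whatever does the translation and wherever the files live):

<cfset fileName = "mustache_" & hash(mustacheTemplate) & ".cfm">
<cfset filePath = expandPath("/compiled/" & fileName)>

<!--- compile once per unique template string; after that the CFML engine's
      template cache treats it like any other .cfm file --->
<cfif NOT fileExists(filePath)>
  <cffile action="write" file="#filePath#" output="#compileToCFML(mustacheTemplate)#">
</cfif>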

Oh, and 9/10 times, I'm sure you'll generate this file on your dev/stage server, and then push it up to your production system. So the production system wouldn't even see the compilation overhead.

Hashing strings is not particularly slow, but if you are concerned about it, let's cache the hash against the string key. Very little overhead that way. Far less memory overhead than caching entire ASTs in memory, and far less CPU than looping around them over and over for every invocation.

Also far less overhead than managing that cache of ASTs, especially when the CFML engine already does this for us - with no extra work on our end, and far less risk of memory leaks (which I've already run into).

Re: Evaluation issues:
Do the cfinclude inside a cfmodule call. Variable safe, and very fast.
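i.e. something along these lines, carrying on from the sketch above (wrapper.cfm and the attribute names are illustrative only):

<!--- calling page: cfmodule gives the include its own variables scope --->
<cfmodule template="/compiled/wrapper.cfm" compiled="/compiled/#fileName#" context="#context#">

<!--- wrapper.cfm --->
<cfset variables.context = attributes.context>
<cfinclude template="#attributes.compiled#">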

Re: Managing compiled CFML:
Why do we need something to manage the cache of files, if they are hashed against the mustache string (i.e. are unique)? Disk space is cheap, and they are very small files. If you are at all concerned, delete them all on occasion from your dev/stage server, and make sure your prod server syncs up with that particular version's incarnation. rsync -r --delete solves this issue.

Re: I'd eventually like to implement a Mustache factory which would load as a singleton and pre-cache all our templates
Again, why re-implement the CF engine template cache? It already exists, and it's proven. Why re-invent the wheel?

Re: This is where closures would potentially rock:
Not following you on this one. I don't see how these help. You would still need to write the code out somewhere before executing it. Unless you start caching a bunch of ASTs... and then we get back to "why would you re-invent the CFML engine template cache?"

To be honest, I can't see how a layer on top of CFML could ever be faster than straight CFML? It doesn't make any sense to me.

dswitzer commented 12 years ago

@markmandel

Like I said, my concerns may be invalid. Just listing them for context and conversation purposes.

I never said that running straight CFML was faster than running something interpreted. My concern was with the compiling into CFML and then invocation for single run templates, that might end up being slower.

We also plan on allowing our customers to leverage Mustache for customizing their outbound messages--which is why I'm thinking about management of compiled code. We have the potential for massive amounts of templates, some of which will rarely (or potentially never) be called.

I just like the way the compilation stuff works in many of the JS libraries. The compilation ends up creating a closure, which can be passed around in memory and run. I like the idea of being able to compile the template when you need the evaluation benefit, without the compiled code having to be managed on disk.

For us, something like this is going to be a fairly common use case:

var template = Mustache.compile(template);

for( var i=0; i < 10000; i++ ){
  Mustache.render(template, context[i]);
}

The problem is the "template" may only ever be used the one time (or very infrequently.) Keeping it around on disk makes no sense. If the compiled template is purely in memory, there's nothing else to manage.

Anyway, we could always add methods for removing a compiled template from disk. Just giving some context in to my line of thinking.

Also, if we compile the template to CFML, wouldn't it be most efficient to just generate straight procedural code, where sections/partials are just conditional blocks that have been fully compiled?

We'd just have to be aware of caching issues due to partials that were changed, but it would be manageable.

rip747 commented 12 years ago

if you use closures then Mustache will require CF9. Currently Mustache is targeted at CF8.01. I would like it to remain at CF8.01 if at all possible, since some of my projects that are on CF8.01 are still using it.


dswitzer commented 12 years ago

@rip747

We're still using CF8.01 as well. I should have left that comment off; it was more of a "I wish CF had introduced closures long before CF10" statement, but that's certainly not clear from my message. It was definitely not my intention to suggest we target CF10.

rip747 commented 12 years ago

all is cool. i even screwed up by stating that closures are in ACF9 instead of ACF10 :P

markmandel commented 12 years ago

Re: *The problem is the "template" may only ever be used the one time (or very infrequently.) Keeping it around on disk makes no sense. If the compiled template is purely in memory, there's nothing else to manage.*

Actually, you very much have to manage how many templates are in memory. If you are letting people throw in templates as they see fit, you will need to manage what templates are stored in memory, and for how long. This is a big concern for memory leaks.

Even for what you are talking about (1 template, thousands of iterations), you're talking about an extra overhead of 2 file I/O operations. (1) write to disk (2) read from disk. I'd take that over implementing my own memory management solution. Once the file has been read in by the CFML engine template cache, the overhead at that point is minimal, as it's the CFML template cache. I can't see how that would be slower.

The only time I can think it may well be slower, is if it was single use, single iteration - in that case, I would say - use the Virtual Disk to compile the template, rather than the physical disk. (Unless you're on 8.01). Is this something you would have a use case for? (It's very far removed from mine - we're pretty much replacing CFML templates with mustache ones).

I see your point re: *We have the potential for massive amounts of templates* - but I would still err on the side of cleaning up the disk, which is much harder to fill up than RAM, which is much easier. Failing that - compile your mustache template to the Virtual Disk, and everyone should be happy at that point :)
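For the Virtual Disk route, a rough sketch (assumes CF9+, the same made-up compileToCFML() step, and a mapping - say /compiledram - pointed at ram:// so the engine can include from it):

<cfset ramFile = "mustache_" & hash(mustacheTemplate) & ".cfm">
<cfif NOT fileExists("ram://" & ramFile)>
  <cffile action="write" file="ram://#ramFile#" output="#compileToCFML(mustacheTemplate)#">
</cfif>
<cfinclude template="/compiledram/#ramFile#">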

Re: *Also, if we compile the template to CFML, wouldn't it be most efficient to just generate straight procedural code, where sections/partials are just conditional blocks that have been fully compiled?*

I assume this refers to my original point above, where I said: "So the way I see this working is that each {{#section}} would need to be converted to a function...".

I can't see how you could generate this compiled CFML into straight procedural code. You can't replace a {{#section}} with a <cfif> or a <cfloop> or a <cfoutput> - because what it does can change at run time, depending on the context. Hence the idea of converting each section into a function that can then (probably with the help of some helper functions) do the appropriate thing depending on the data coming in. It's the only way I can see it working (at least in my head). I'm open to other design ideas though.

dswitzer commented 12 years ago

@markmandel:

On 4/9/2012 6:56 PM, Mark Mandel wrote:

> Re: *The problem is the "template" may only ever be used the one time (or very infrequently.) Keeping it around on disk makes no sense. If the compiled template is purely in memory, there's nothing else to manage.*
>
> Actually, you very much have to manage how many templates are in memory. If you are letting people throw in templates as they see fit, you will need to manage what templates are stored in memory, and for how long. This is a big concern for memory leaks.

When I said keeping it in memory, I wasn't talking about storing as a singleton or storing in a persistent scope. I was referring to a compiled template that only lived the life of the current request--where GC should clean it up.

> Even for what you are talking about (1 template, thousands of iterations), you're talking about an extra overhead of 2 file I/O operations. (1) write to disk (2) read from disk. I'd take that over implementing my own memory management solution. Once the file has been read in by the CFML engine template cache, the overhead at that point is minimal, as it's the CFML template cache. I can't see how that would be slower.
>
> The only time I can think it may well be slower, is if it was single use, single iteration - in that case, I would say - use the Virtual Disk to compile the template, rather than the physical disk. (Unless you're on 8.01). Is this something you would have a use case for? (It's very far removed from mine - we're pretty much replacing CFML templates with mustache ones).

I wasn't trying to imply that all compiled templates should be stored in some kind of persistent scope. Ideally I'd have a factory where my core templates are persisted (which consist of a known, limited number of templates) and then the rest are compiled as needed.

However, it's all a moot point if the goal is to compile to CFML, as we have no native way to render to memory in CF8.01 (which is definitely a requirement of mine.)

For me, my use case is often closer to single use, single iteration. While it's likely the template is used more than once in its lifetime, there are many use cases where it's used so infrequently (with the potential of it never being used again) that I'd rather handle it as a single use/iteration.

These templates will be mainly pretty simple and from my testing generally already run in < 15ms, so for me these types of operations don't really gain from compilation.

Maybe the best option for now is to have compile() compile to CFML and return a struct w/ information on the compiled template.

The render() method would look at the "template" argument and if it's a struct, would then try to execute the compiled template.

NOTE: I'd have the compile() return a complex variable, that way it doesn't leave potential security holes. If we just returned a string path, then that would leave a user-defined template open to exploits.
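Rough sketch of that API shape (struct contents and variable names are just invented for illustration):

<cfscript>
  // compile() hands back an opaque struct describing the compiled template,
  // e.g. its hash and the path to the generated file, rather than a bare
  // file path a caller could tamper with
  compiled = mustache.compile(templateString);

  // render() dispatches on the argument type: a struct means "run the
  // compiled template", a plain string means "interpret it like we do today"
  output = mustache.render(compiled, context);
</cfscript>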

> I see your point re: *We have the potential for massive amounts of templates* - but I would still err on the side of cleaning up the disk, which is much harder to fill up than RAM, which is much easier. Failing that - compile your mustache template to the Virtual Disk, and everyone should be happy at that point :)

The project I'm working on is for CF8.01, so the virtual disk is not an option.

I'm almost wondering if it makes sense to support both a compiled method and the current interpreted code. That way you can utilize compiled templates if it makes sense, but don't need to compile all your templates.

For me, it would make sense to be able to pre-compile some of my templates--I have a handful of core templates that will be re-used over and over.

However, I also have a bunch of templates that may only be used once or rarely used. For those templates, keeping a persistent cache on disk doesn't make a lot of sense. We could certainly build management into the CFC, but it certainly adds a lot of complexity.

All I'm saying is there are times when I don't think it makes sense to make a compiled template persistent on disk. It works very well for some use cases, but not for all.

> Re: *Also, if we compile the template to CFML, wouldn't it be most efficient to just generate straight procedural code, where sections/partials are just conditional blocks that have been fully compiled?* I assume this refers to my original point above, where I said: "So the way I see this working is that each {{#section}} would need to be converted to a function...".
>
> I can't see how you could generate this compiled CFML into straight procedural code. You can't replace a {{#section}} with a <cfif> or a <cfloop> or a <cfoutput> - because what it does can change at run time, depending on the context. Hence the idea of converting each section into a function that can then (probably with the help of some helper functions) do the appropriate thing depending on the data coming in. It's the only way I can see it working (at least in my head). I'm open to other design ideas though.

I was imagining the compiler might end up producing something like this for sections:

<cfif isStruct(...)> <cfelseif isArray(...)> <cfelseif isFunction(...)>

Maybe helper functions would be better; I was just thinking about trying to make the compiler produce code that is as linear as possible--which I assumed might be the most efficient. I don't really care how it's ultimately done, just thinking that we should have the results produce the fastest code we can (since the ultimate goal is speed.) This was just an idea.
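For example, the compiled output for a simple {{#people}}{{name}}{{/people}} section might come out looking something like this (purely illustrative; I'm using isCustomFunction() for the lambda check since that's the built-in test):

<!--- hypothetical compiler output for {{#people}}{{name}}{{/people}} --->
<cfif NOT structKeyExists(context, "people")>
  <!--- missing/falsey key: render nothing --->
<cfelseif isStruct(context.people)>
  <!--- nested context --->
  <cfoutput>#context.people.name#</cfoutput>
<cfelseif isArray(context.people)>
  <!--- list: repeat the inner block per item --->
  <cfloop array="#context.people#" index="person">
    <cfoutput>#person.name#</cfoutput>
  </cfloop>
<cfelseif isCustomFunction(context.people)>
  <!--- lambda: gets the raw inner template text --->
  <cfoutput>#context.people("{{name}}")#</cfoutput>
</cfif>
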
pmcelhaney commented 12 years ago

I agree with Dan. Ideally, the code should be rewritten as an actual parser. The performance of one iteration would be about the same, perhaps a little faster, and iterating over multiple contexts with the same template would be substantially faster.

The API shouldn't change -- you'll know you're done when the existing unit tests pass. Although it would probably be a good idea to add a compile() method so you can cache the AST externally in a way that makes sense for your app.
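For example, an app could then do something like this (just a sketch; the compile()/render() shapes are assumed):

<cfscript>
  // cache the parsed template wherever it suits the app,
  // e.g. application scope for a template that is reused constantly
  if (NOT structKeyExists(application, "invoiceTemplate")) {
    application.invoiceTemplate = mustache.compile(invoiceTemplateString);
  }
  html = mustache.render(application.invoiceTemplate, invoiceData);
</cfscript>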

I've never written a parser before. It's on my bucket list. :)