pugjs / pug-code-gen

https://github.com/pugjs/pug/tree/master/packages/pug-code-gen
7 stars 9 forks source link

use string concatenation instead of [].join #4

Closed alubbe closed 9 years ago

alubbe commented 9 years ago

Porting https://github.com/jadejs/jade/pull/1977

ForbesLindesay commented 9 years ago

Did you try benchmarking the option of doing:

buf[buf.length] = foo;

instead of

buf.push(foo);

I know that can often prove faster as it skips the method invocation. I wonder if it might turn out to actually be faster than string concatenation?

alubbe commented 9 years ago

yes I did and it made no difference. It is faster than .push when you know the length of the array before-hand because .push also bumps the length, but it was always slower than a = a + b

jeromew commented 9 years ago

I upgraded then-jade to 1.11.0 (no problem there) and started thinking again about then-jade and the upcoming jade 2.

@ForbesLindesay I think we agree that one of the goal of the new architecture is to make it possible and not too hard to specialize the jade graph for new usage (things like then-jade, react-jade)

the buf.push() pattern seems inevitable for then-jade since there is now way to intercept a s+= in javascript.

would you agree if jade-code-gen systematically calls a function that implements the concatenation strategy ? I guess It could slow down the compiler a little bit but

for example :

// current
this.buf.push('buf.push("' + str + '");');

// proposed by @alubbe 
this.buf.push('jade_html = jade_html + "' + str + '";');

// proposed by @jeromew
function jade_html_contat(s) {
    return 'jade_html = jade_html  + ' + s;
}
this.buf.push(jade_html_concat('"'+s+'";'));

This might need some work on the edges but you will get the idea.

The same could also be true for function calls. Many modifications in then-jade boil down to modifying function(..) by function*(..) or myfunc() by yield *myfunc()

a set of 'macro-like` methods for :

could simplify the derivation of jade-code-gen and ease the upgrade path for derivated libraries. I did not do the same research for react-jade to see how much specialized code we could avoid there.

Tell me what you think

ps: if you look at https://github.com/jadejs/then-jade/blob/master/lib/compiler.js, you can read the derivation comments for each of the 3 derived functions in the comment block of each function. You will see that the derivation of then-jade would currently be a no brainer if we had generating functions for function definition and function invocation.

alubbe commented 9 years ago

My initial idea to reconciliate our concerns was to pass the compiler a flag to toggle += vs Array.push, with the default being +=. It's not very flexible, but all of the code needed for it could live in this repo and could be tested here.

I like your idea better but at the same time this means that you are extracting knowledge of everything else around string concat, function definition and function invocation into the consumer - meaning anything could break with every release. You would also need a second function/flag to highjack the global variables and specifically, the initialization code of jade_html = "" vs jade_buf = [] (or something completely new).

Effectively, I see your suggestion implying that this repo would allow overriding default (internal) functions that side-effect each other. So let's say you as a consumer of this repo want to pass a different function. Chances are, you'll need to have a very close look at whether your stuff collides with the other 'default' functions.

Just to reiterate, I'm more for than against it, but I see the potential for a not-so-clean interface coming out of it. Maybe you could start on putting together a solution for string concat and we'll take it from there?

alubbe commented 9 years ago

Upon further thinking, now that jade is modularized, maybe another solution would be for you to fork this repo in its entirety. It's basically one file with <800 lines that you can control and adapt tests for. Also, whenever you merge upstream, you will see immediately if the other 'default' functions break your implementation either through merge conflicts or failing tests.

jeromew commented 9 years ago

I have a hard time pondering the pros and cons of forking jade-code-gen versus having a more modular code generator.

on the 800 lines, the current modifications to obtain then-jade is around 10 lines of trivial modifications that are of the type 'function' => 'function' or 'myfunc()' => 'yield myfunc()', basically introducing

this.gen_function = 'function';
this.gen_before_function_call = '';

is sufficient to have a trivial inheritance of the whole thing without any duplication of code.

my first example was bad because it indeed involved a hidden 'jade_html' variable and needed another method for the initialization of jade_html

I am more into a 'formal' operators :

"a + b" => gen_plus("a","b")
"function(..){..}" => gen_function("(...){...}")
"myfunc(...)" => gen_before_function_call("myfunc(...)")

This is a bit like meta-meta-programming so it of course feels like a bad thing / over optimization but we could also think of it in terms of jade-code-gen that should output an AST of the generated code instead of the code itself.

I am biased because I use then-jade and would like it to become a first class citizen in order to keep it synced with upstream. Your point about tests is correct even tough then-jade already catches problems by executing all the jade tests.

For function definition and function calls, it would be easy to put up some sed rules to automatically transform jade-code-gen code in the then-jade version ; not so easy for the '+'.

The reason why the generator of then-jade and jade are currently so close is that the generated code basically do the same thing : push strings into an accumulator.

It may have value for then-jade to have its own implementation of jade-code-gen and this level of customization (forking jade-code-gen) is maybe the correct extension point.

The [].join() optimization increases the fork footprint and upstream dependency management overhead ; this is probably why i try to see if it makes sense for jade-code-gen to have code generation primitives.

alubbe commented 9 years ago

With you saying your modifications are 10 lines of trivial changes, it starts to sound a lot like a fork would make sense and would be easy to maintain - but I'm with you, having the core be more flexible is also appealing.

Can we have some more opinions? ;)

ForbesLindesay commented 9 years ago

I think what I'd like to do is experiment with pushing this module to be more modular (e.g. could we push the attrs code into a separate module?). That way hopefully we could get this module to be way smaller than 800 lines. We could then easily fork this module in its entirety. Updates of this module would then have to be manually added/integrated into then-jade (but that's already the case with react-jade, which completely forks the code-gen stage anyway). It could, however, have a loose (i.e. ^1.0.0 style) version specifier for jade-lex, jade-parse, jade-load, jade-link, jade-filters etc. This would allow a lot of changes/syntax sugar and bug fixes to be added without needing to update the compiler.

alubbe commented 9 years ago

So if I understand you correctly, your course of action would be to modularize further. What do we do until we are at that point? My suggestion would be for @jeromew to fork this repo and change the 10 lines he needs. Hopefully, this will be a temporary thing and further modularization is just around the corner ;) With that, I believe it would make sense to merge the PR for now and then later add a layer on top of it to replace it with whatever way of building up the HTML that we want to use/inject.

TimothyGu commented 9 years ago

I am inclined to merge this. I'd agree that then-jade should fork this module, because that is the only different thing that sets jade and then-jade apart (or is it?).

alubbe commented 9 years ago

Any further input on this PR or is it ready to go?

TimothyGu commented 9 years ago

Looks good to me.

TimothyGu commented 9 years ago

Also regarding the attrs function modularization, see #6 which is pending a new release of jade-attrs.