Asset Handling - Githubissues

PebbleTemplates / pebble

Java Template Engine

https://pebbletemplates.io

BSD 3-Clause "New" or "Revised" License

Asset Handling #115

Closed thomashunziker closed 8 years ago

thomashunziker commented 9 years ago

In a typical web scenario you want to add external resources (e.g. CSS, JavaScript etc.) to the produced output.

For HTML5 all CSS resources should be placed within the 'head' tag. The JavaScript resources should be placed before the closing 'body' tag to improve loading speed. Additionally, those resources should be combined into one file to reduce the number of HTTP requests to the server, because browsers limit the number of parallel connections per domain (the HTTP/1.1 specification, RFC 2616, recommended at most two; current browsers typically allow about six).

So what typically happens is that all resources are manually combined and added to the template. But this is hard with twig, since you will add more resources in other child templates. E.g. there is a base.twig which all other templates extend. From a practical standpoint it would be easier if each sub template and included template could add new resources, without base.twig needing to know anything about them. This way only those resources which are actually required are included in the HTML page, and not all resources which might potentially be required.

We are currently implementing such a solution. Is there any interest in a pull request for this feature?

thomashunziker commented 8 years ago

I have implemented a version which is capable of doing this kind of stuff. However, the implementation is hard to achieve within the current setup of Pebble. Some portion of the template requires a second rendering phase, meaning we render some portion of the template twice! This has to do with the fact that you can define certain assets after the point where they are included. Hence we need a first phase in which we collect the assets, and a second phase in which we include the assets at the right location. If such a feature should be added to the core of Pebble, we need to change the way we render the template. We need to introduce an additional phase which processes all the tags before we actually start rendering the template.

pacey commented 8 years ago

You should leave the bundling of resources to a tool like webpack or browserify? Pebble is a rendering engine, not a framework.

thomashunziker commented 8 years ago

@pacey you are right that Pebble is not a framework. However I see use cases where you want to collect the CSS and JS and produce one single tag at the end of the page (or for CSS in the head).

You can collect those things with other solutions. There I see two options:

You use some JS library which somehow collects your CSS and JS and produces a single include statement.
You post-process the HTML on the server and collect the resources.

For both solutions I see some serious problems:

You need at least one additional HTTP call to fetch the JS and CSS files, since you first need to load a basic JS file which then post-processes your HTML output. Hence the loading can be slightly slower than when you determine the JS files on the server side and aggregate them there.
When you post-process the HTML on the server side you need to parse the HTML again and include the result in the HTML page. This is far slower than simply using Pebble for this.

As I said, I have implemented the solution for us already and it seems to work quite well.

decebals commented 8 years ago

I had the same discussion with a friend of mine (@balamaci). My friend suggested that I use webpack or browserify. From what I have read, I understand that browserify depends on nodejs. I am very curious to see a basic demo application that integrates webpack/browserify with Java. Another solution, from my point of view, is to use a web optimizer written in Java (for example https://github.com/wro4j/wro4j), but to be honest I prefer a JavaScript solution if it is easy to integrate.

Recently I released on GitHub (https://github.com/decebals/matilda) one of my applications that uses Pippo and Pebble. I made this move to try to find, together with the Pippo community, some best practices for building a web application with this micro web framework. I think it would be good to improve the performance of the application with such functionality.

@thomashunziker

As I said, I have implemented the solution for us already and it seems to work quite well.

Can you share with us some code?

@pacey Have you implemented a solution based on webpack or browserify? If yes, I am curious to hear your story :smile:

thomashunziker commented 8 years ago

I will share the code, but for this I need to be sure that it will go into the core. Since it is a bit of work to extract it from our system.

Actually, what you really want is an example of how to use it. So I add below some example templates and the resulting output:

File: base.twig

<html>
    {% assetSection %}
        <head>
            <!-- Here we include the css files -->
            {% assetInclude css %}
        </head>
        <body>
            {% asset css 'first.css' %}
            {% asset js 'first.js' %}
            <p>some hmtl</p>
            {% include 'other.twig' %}
            {% asset css 'second.css' %}
            {% asset js 'second.js' %}

            <!-- Here we include the js files -->
            {% assetInclude js %}
        </body>
    {% endAssetSection %}
</html>

File: other.twig

<div>
    <h1>More Content</h1>
    {% asset css 'third.css' %}
    {% asset js 'third.js' %}
</div>

The output of the include statements can be controlled via Java. The example below would require an implementation which uses a CSS / JS compiler that combines all files together. The implementation of the concrete output can be changed depending on the use case:

<html>
    <head>
        <link href="http://localhost/generated.css?c=first.css,second.css,third.css&hash=83dd34ab73c29" media="all" rel="stylesheet" />
    </head>
    <body>
        <p>some hmtl</p>
        <div>
            <h1>More Content</h1>
        </div>
        <script src="http://localhost/generated.js?c=first.js,second.js,third.js&hash=e39534ab23c21"></script>
    </body>
</html>

The only thing you need to write is a controller which can handle the aggregation of all the provided css and JS files.
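The aggregation step of such a controller can be sketched in a few lines. This is only a minimal sketch under stated assumptions: `AssetAggregator`, `parseNames` and `combine` are hypothetical names, the in-memory `Map` stands in for whatever asset storage a real application uses, and rejecting unknown names is an added precaution that is not part of the proposal above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for the server side of "generated.css?c=first.css,second.css".
final class AssetAggregator {

    /** Splits the "c" query parameter into individual asset names. */
    static List<String> parseNames(String cParam) {
        List<String> names = new ArrayList<>();
        for (String name : cParam.split(",")) {
            if (!name.isBlank()) {
                names.add(name.trim());
            }
        }
        return names;
    }

    /** Concatenates known assets in the requested order; unknown names are rejected. */
    static String combine(List<String> names, Map<String, String> store) {
        StringBuilder out = new StringBuilder();
        for (String name : names) {
            String content = store.get(name);
            if (content == null) {
                throw new IllegalArgumentException("Unknown asset: " + name);
            }
            out.append("/* ").append(name).append(" */\n").append(content).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: assets are already loaded into memory, keyed by name.
        Map<String, String> store = Map.of("first.css", "a{}", "second.css", "b{}");
        System.out.print(combine(parseNames("first.css,second.css"), store));
    }
}
```

Looking names up in a fixed store, rather than turning them into file paths, keeps a request from reaching files that were never meant to be served.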

The real advantage of these asset tags is that you can include your resources where you use them in the templates, and you do not need to include them somewhere else.

pacey commented 8 years ago

So we use Gradle to build our application and have made a node proxy plugin that can call node scripts from the Gradle scripts. When we do a full build we attach the Webpack build into the processResources task I think, and then copy them into the build/resources directory so they are on the classpath when it gets archived into a jar.

In Webpack we create a JS and CSS bundle for each page, so that each page only has to make 1 request for the JS file and 1 request for the CSS file. We use the ES6 module definition of import, export etc., and Webpack can follow your dependency tree to build up a bundled JavaScript file (which is awesome).

If you use Gradle too I can share the plugin we wrote to call the node scripts, if it helps you guys.

mbosecke commented 8 years ago

@thomashunziker, This sounds like a really interesting idea but I'm a little hesitant.

Because Pebble is an all-purpose template engine it can be used for generating any type of textual output, not just HTML. It can be used to generate CSV, JS, XML, SQL, etc., and because of that I'm always reluctant to add new features that are specific to HTML.

I would much rather see it implemented as an optional third-party extension, at least at first. I do, however, see why the requirement of a second rendering phase makes that impossible with the current state of Pebble. My preference would be to make the minimal amount of change to the core that will give third-party extensions the power to do this; perhaps give the extensions some sort of "pre-render" phase where they have access to the template and the user-provided data before it gets rendered. Do you think a "pre-render" phase would be enough for you to be able to implement your idea as an extension?

mbosecke commented 8 years ago

On second thought, a "pre-render" phase would probably not suffice. The import tag supports dynamic expressions which won't be evaluated until the actual "render" phase, so the extension wouldn't have access to the imported templates. I suppose it would have to be some sort of a "post-render" phase, but what could we provide the extension without it having to resort to parsing the already-generated HTML? Hmm.

thomashunziker commented 8 years ago

In our implementation we trigger the rendering twice. The first time we use a 'NullWriter'. So actually we render the template twice. This works. However in theory this can be optimized when certain tags, filter etc. are aware of this pre-rendering phase, because those tags could skip certain stuff.
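The double-render approach can be illustrated without any Pebble internals. The following is a minimal sketch, assuming a hypothetical `Template` callback and `AssetCollector` class (neither is Pebble API, and this is not the implementation described above): the first pass writes to a writer that discards everything and only records assets, the second pass produces the real output.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of "render twice"; none of these types are Pebble API.
final class TwoPassRenderDemo {

    /** Stand-in for a compiled template: writes output and may register assets. */
    interface Template {
        void render(Writer out, AssetCollector assets) throws IOException;
    }

    /** Collects asset paths in first-seen order, ignoring duplicates. */
    static final class AssetCollector {
        private final Set<String> paths = new LinkedHashSet<>();
        private boolean frozen;

        void add(String path) {
            if (!frozen) {
                paths.add(path);
            }
        }

        void freeze() {
            frozen = true;
        }

        String joined() {
            return String.join(",", paths);
        }
    }

    /** A Writer that discards everything, like the 'NullWriter' mentioned above. */
    static final class NullWriter extends Writer {
        @Override public void write(char[] buf, int off, int len) { }
        @Override public void flush() { }
        @Override public void close() { }
    }

    static String renderTwice(Template template) {
        try {
            AssetCollector assets = new AssetCollector();
            template.render(new NullWriter(), assets); // pass 1: collect only
            assets.freeze();                           // pass 2 must not change the set
            StringWriter out = new StringWriter();
            template.render(out, assets);              // pass 2: real output
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Template page = (out, assets) -> {
            // The include point comes BEFORE the assets are declared.
            out.write("<link href=\"/generated.css?c=" + assets.joined() + "\"/>");
            assets.add("first.css");
            out.write("<p>body</p>");
            assets.add("second.css");
        };
        System.out.println(renderTwice(page));
    }
}
```

The cost is visible in `renderTwice`: every expression in the template body runs twice, which is the overhead discussed in this thread.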

decebals commented 8 years ago

@pacey I don't use Gradle, but from your description your solution seems a little bit complicated to me.

@mbosecke One or more extensions with stuff related to html sounds good for me. I use Pebble because I have not found anything better than Pebble to help me generate html pages.

@thomashunziker Your example looks good as a starting point for discussion. I would prefer multiple asset zones (your assetSection) instead of one big one.

Now my code looks like:

{% block headCss %}
    <link href="{{ webjarsAt('bootstrap/css/bootstrap.min.css') }}" rel="stylesheet">
    <link href="{{ webjarsAt('font-awesome/css/font-awesome.min.css') }}" rel="stylesheet">
    <link href="{{ webjarsAt('bootstrap-datepicker/css/bootstrap-datepicker.min.css') }}" rel="stylesheet">
    <link href="{{ publicAt('css/app.css') }}" rel="stylesheet">
{% endblock %}

where publicAt, webjarsAt are custom functions.

Sure, I can add an asset as a static block if I know the path to that resource. But the new Pebble asset tag must accept a function (for example publicAt) as a parameter.

mbosecke commented 8 years ago

I'm just kind of thinking out loud here as a way of getting my thoughts down regarding @thomashunziker's original proposal:

Problem to solve

Allow INCLUDED templates to add content to designated sections of the original template.

Blocks are more limited because a template has to explicitly extend a parent template in order to override its blocks, plus multiple inheritance isn't supported. Also, the existing block system completely OVERRIDES a parent block, whereas it would be nice to include an arbitrary number of templates that each APPEND content to a particular section.

New tags

{% append 'name' %} which contains content to be appended to an existing block of the same name.

Naming is up for debate.

Implementation:

Phase 1 (prepare): Traverse the node tree invoking a new method on all nodes called "prepare". Prepare is just like the existing "render" method but it is not given a Writer object. Most nodes will not do anything in this method.
- include node: Evaluate its expression to find out which template is being included, compile the included template, save the included template in the evaluation context (so that it can be reused during the next phase). Invoke the "prepare" phase on the included template at this time.
- extends node: Same as include node: evaluate the parent template, compile it, save it in the evaluation context for later, and invoke the "prepare" phase on the parent template.
- append node: Store a reference to this node in the evaluation context.
Phase 2 (render): Typical render phase which outputs to a Writer object.
- include node: Find the "included" template which is saved in the evaluation context and render it.
- extends node: do nothing.
- block node: Render this block normally and then find all the relevant append nodes that are saved in the evaluation context, render them now and append their output to this block.
- append node: do not render. They should only be rendered by block nodes.
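The two phases listed above can be sketched with a toy node tree. This is a rough illustration only: `Node`, `Context`, `Block`, `Append` and the other classes are hypothetical stand-ins rather than Pebble's real classes, includes are modelled as nested `Sequence` nodes, and template compilation and the extends node are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node model for the prepare/render idea; not Pebble's real classes.
final class PrepareRenderDemo {

    /** Shared state for both phases: append bodies found during "prepare". */
    static final class Context {
        final Map<String, List<Node>> appends = new HashMap<>();
    }

    interface Node {
        default void prepare(Context ctx) { } // most nodes do nothing here
        void render(Context ctx, StringBuilder out);
    }

    static final class Text implements Node {
        private final String text;
        Text(String text) { this.text = text; }
        @Override public void render(Context ctx, StringBuilder out) { out.append(text); }
    }

    /** Renders its children in order; stands in for a template or an include. */
    static final class Sequence implements Node {
        private final List<Node> children;
        Sequence(Node... children) { this.children = List.of(children); }
        @Override public void prepare(Context ctx) { children.forEach(c -> c.prepare(ctx)); }
        @Override public void render(Context ctx, StringBuilder out) {
            children.forEach(c -> c.render(ctx, out));
        }
    }

    /** Registers its body in phase 1 and emits nothing on its own in phase 2. */
    static final class Append implements Node {
        private final String blockName;
        private final Node body;
        Append(String blockName, Node body) { this.blockName = blockName; this.body = body; }
        @Override public void prepare(Context ctx) {
            ctx.appends.computeIfAbsent(blockName, k -> new ArrayList<>()).add(body);
        }
        @Override public void render(Context ctx, StringBuilder out) { }
    }

    /** Renders its own body, then every append body registered under its name. */
    static final class Block implements Node {
        private final String name;
        private final Node body;
        Block(String name, Node body) { this.name = name; this.body = body; }
        @Override public void prepare(Context ctx) { body.prepare(ctx); }
        @Override public void render(Context ctx, StringBuilder out) {
            body.render(ctx, out);
            for (Node appended : ctx.appends.getOrDefault(name, List.of())) {
                appended.render(ctx, out);
            }
        }
    }

    static String render(Node root) {
        Context ctx = new Context();
        root.prepare(ctx);                 // phase 1: collect append nodes
        StringBuilder out = new StringBuilder();
        root.render(ctx, out);             // phase 2: produce output
        return out.toString();
    }

    public static void main(String[] args) {
        Node base = new Sequence(
                new Text("c="),
                new Block("js", new Text("base.js")),
                new Sequence(new Append("js", new Text(",module1.js"))),  // "include" 1
                new Sequence(new Append("js", new Text(",module2.js")))); // "include" 2
        System.out.println(render(base));
    }
}
```

Because the appends are collected before any output is written, the block can appear earlier in the tree than the nodes that append to it.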

Example

base.html:

<script src="http://localhost/generated.js?c=base.js{% block 'js' %}{% endblock %}&hash=e39534ab23c21"></script>
{% include 'module1.html' %}
{% include 'module2.html' %}

module1.html:

{% append 'js' %},module1.js{% endappend %}

module2.html:

{% append 'js' %},module2.js{% endappend %}

result:

<script src="http://localhost/generated.js?c=base.js,module1.js,module2.js&hash=e39534ab23c21"></script>

Considerations

It would be nice if the append tags didn't have to provide the comma delimiter in the above example. Maybe the delimiter can be specified by the original block somehow?
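One possible shape for that idea, sketched outside of any template syntax: the block owns the delimiter and the appends contribute bare values. `DelimitedBlock` is a hypothetical helper used only for illustration; how the delimiter would be declared in a template is left open here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the block owns the delimiter, appends supply bare values.
final class DelimitedBlock {
    private final String delimiter;
    private final List<String> parts = new ArrayList<>();

    DelimitedBlock(String delimiter, String initialContent) {
        this.delimiter = delimiter;
        if (!initialContent.isEmpty()) {
            parts.add(initialContent); // an empty block contributes no leading delimiter
        }
    }

    void append(String content) {
        parts.add(content);
    }

    String render() {
        return String.join(delimiter, parts);
    }

    public static void main(String[] args) {
        DelimitedBlock js = new DelimitedBlock(",", "base.js");
        js.append("module1.js");
        js.append("module2.js");
        System.out.println(js.render());
    }
}
```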

thomashunziker commented 8 years ago

The two-phase model solves the issue with the rendering order. A second rendering is not required anymore. However, do not expect too much from it, because you also need to evaluate all control flow tags (such as if, for etc.) in phase 1. So you can skip certain tags, but most of them you need to evaluate normally.

We need at least two tags:

We need a tag to specify what to include / append.
We need a tag to produce the output.

The first one is easy. It only records what it finds within the first phase.

The second is a bit more complicated because it really depends on what you try to achieve. For CSS / JS you want to combine the recorded files into a single URL. Normally you need to attach a hash. In our case we even compress the URL string to reduce the length.
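As an illustration of combining recorded files into a single URL with a hash, here is a minimal sketch. It assumes the hash is a truncated SHA-256 over the asset contents; the thread does not say how the real hash is computed, and `AssetUrlDemo`, `hash` and `url` are hypothetical names. The URL compression mentioned above is not shown.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: one URL for many assets, plus a content hash for cache busting.
final class AssetUrlDemo {

    /** Returns the first 6 bytes of a SHA-256 digest over all contents, as 12 hex chars. */
    static String hash(List<String> contents) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String content : contents) {
                digest.update(content.getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0); // separator so ["ab","c"] differs from ["a","bc"]
            }
            byte[] bytes = digest.digest();
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 6; i++) {
                hex.append(String.format("%02x", bytes[i]));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    /** Builds endpoint?c=name1,name2&hash=...; each name is URL-encoded on its own. */
    static String url(String endpoint, List<String> paths, List<String> contents) {
        String joined = paths.stream()
                .map(p -> URLEncoder.encode(p, StandardCharsets.UTF_8))
                .collect(Collectors.joining(","));
        return endpoint + "?c=" + joined + "&hash=" + hash(contents);
    }

    public static void main(String[] args) {
        System.out.println(url("http://localhost/generated.css",
                List.of("first.css", "second.css"),
                List.of("a{}", "b{}")));
    }
}
```

Hashing the contents rather than the names means the URL changes whenever a file changes, so browsers can cache the combined file aggressively.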

Therefore I recommend the ability to register a handler for the generation of the output. Our handler interface looks like:


import java.io.IOException;
import java.io.Writer;
import java.util.List;

/**
 * The asset handler is called to generate the output for the assets within a template.
 *
 * <p>
 * The asset handler may process the given assets (e.g. minify them and combine them).
 *
 * @author Thomas Hunziker
 *
 */
public interface AssetHandler {

    /**
     * This method processes the provided {@code assets} of the given {@code assetType}.
     *
     * <p>
     * The implementer may process them in a specific way. E.g. combine them, minify them, add a prefix to the path etc.
     *
     * <p>
     * A typical implementation will combine all the given resources together into a single URL. On this URL a dedicated
     * listener will provide those resources in a single file. This will improve the performance because the browser
     * eventually needs to make only one request per asset type.
     *
     * @param assetType
     *            the asset type of which the {@code assets} are.
     * @param assets
     *            the assets which should be handled.
     * @param writer
     *            the writer to which the output should be written to.
     * @throws IOException
     *             thrown when the output could not be written.
     */
    public void handle(AssetType assetType, List<IAsset> assets, Writer writer) throws IOException;

}

This way everyone can inject a different strategy to handle the assets.

Our implementation looks like:

import java.io.IOException;
import java.io.Writer;
import java.util.List;
import java.util.stream.Collectors;

// Note: assetLoadService, ThreadContextHolder and AssetUtil are not shown here.
public class SimpleAssetHandler implements AssetHandler {

    @Override
    public void handle(AssetType assetType, List<IAsset> assets, Writer writer) throws IOException {
        // Remove duplicate paths while keeping the order in which they were recorded.
        final List<String> paths = assets.stream().map(a -> a.getPath()).distinct().collect(Collectors.toList());
        final String hash;
        final String generatorPart;
        if (assetType == AssetType.CSS) {
            hash = this.assetLoadService.getCssHash(paths);
            generatorPart = "assets/compressed.css";
        } else if (assetType == AssetType.JAVASCRIPT) {
            hash = this.assetLoadService.getJavaScriptHash(paths);
            generatorPart = "assets/compressed.js";
        } else {
            throw new RuntimeException("The asset type " + assetType + " is not processable.");
        }

        // One URL carries all asset paths plus the hash.
        String url =
                ThreadContextHolder.getContext().buildUrl(
                        generatorPart + "?p=" + AssetUtil.encodeAssetPaths(paths) + "&h=" + hash);
        this.handleAssetType(assetType, url, writer);
    }

    // Writes the single <link> or <script> tag for the combined URL.
    private void handleAssetType(AssetType type, String path, Writer writer) throws IOException {
        if (type == AssetType.CSS) {
            writer.append("<link href=\"").append(path).append("\" rel=\"stylesheet\" />");
        } else if (type == AssetType.JAVASCRIPT) {
            writer.append("<script src=\"").append(path).append("\"></script>");
        } else {
            throw new RuntimeException("Unknown asset type '" + type + "'.");
        }

    }

}

Possibly we can convert the above concept into a more generic one which can also be used for things other than CSS / JS. In other use cases, such as e-mail generation, we may face similar issues.

mbosecke commented 8 years ago

You're right about having to evaluate all control flow statements during the first phase, I didn't think of that. That's very worrisome, because I would need somewhere to store all the evaluated results so that all these nodes don't need to be re-evaluated during the second phase. This is a big red flag that this might be more effort than it's worth.

But as for the "Handler" interface, Pebble already has functionality that allows you to take an input and manipulate it to provide a custom output, which is by using a "filter". Here's how your AssetHandler would be implemented as a custom pebble filter:

parent.html:

{% filter asset('javascript') %}{% block 'js' %}base.js{% endblock %}{% endfilter %}
{% include 'module1.html' %}
{% include 'module2.html' %}

module1.html:

{% append 'js' %},module1.js{% endappend %}

module2.html:

{% append 'js' %},module2.js{% endappend %}

custom filter:

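import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import com.mitchellbosecke.pebble.extension.Filter; // package name as of Pebble 2.x

public class AssetFilter implements Filter {

    @Override
    public List<String> getArgumentNames() {
        List<String> args = new ArrayList<>();
        args.add("type");
        return args;
    }

    @Override
    public Object apply(Object input, Map<String, Object> args) {
        String[] assets = ((String) input).split(","); // {"base.js", "module1.js", "module2.js"}
        String assetType = (String) args.get("type");

        // ... whatever processing you need to do to generate the path
        // (placeholder: re-join the names; a real implementation would also append the hash)
        String path = String.join(",", assets);
        return "<script src=\"http://localhost/generated.js?c=" + path + "\"></script>";
    }

}

result:

<script src="http://localhost/generated.js?c=base.js,module1.js,module2.js&hash=e39534ab23c21"></script>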

So I still think the append tag that I proposed would be enough to create the functionality you are talking about, and it's generic enough to be included in a general purpose template engine.

I just need to think more about where to store all those evaluated results during the first rendering phase so that the relevant nodes don't have to be re-evaluated during the second phase.

thomashunziker commented 8 years ago

To avoid executing too much twice, I introduced this assetSection tag to limit the scope of the second rendering.

I see two options:

Store nothing (re-evaluate all expressions in the second phase). Even though this sounds stupid, it is possibly not so bad.
Store the results somehow in the evaluation context. We could use a Map with the node as the key and the result as the value.

It's a trade-off between memory and CPU.
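The second option can be sketched as a small cache keyed by node identity. This is an illustration under assumptions, not a proposal for Pebble's evaluation context: `EvaluationCache` and `evaluateOnce` are hypothetical names, and plain `Object` instances stand in for template nodes.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: remember each node's phase 1 result for reuse in phase 2.
final class EvaluationCacheDemo {

    static final class EvaluationCache {
        // Identity semantics: two distinct nodes never share an entry, even if equal.
        private final Map<Object, Object> results = new IdentityHashMap<>();

        @SuppressWarnings("unchecked")
        <T> T evaluateOnce(Object node, Supplier<T> evaluation) {
            if (!results.containsKey(node)) { // containsKey also covers null results
                results.put(node, evaluation.get());
            }
            return (T) results.get(node);
        }
    }

    public static void main(String[] args) {
        EvaluationCache cache = new EvaluationCache();
        Object ifNode = new Object(); // stands in for an "if" node
        int[] evaluations = {0};
        Supplier<Boolean> condition = () -> {
            evaluations[0]++;
            return Boolean.TRUE;
        };
        cache.evaluateOnce(ifNode, condition); // phase 1: evaluates the condition
        cache.evaluateOnce(ifNode, condition); // phase 2: served from the cache
        System.out.println("evaluations: " + evaluations[0]);
    }
}
```

Keying by node alone is not enough for a node inside a for loop, which is evaluated once per iteration with different values; such nodes would need the iteration as part of the key, which adds to the memory side of the trade-off.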

jknack commented 8 years ago

It sounds good, but in my opinion this is out of the scope of a template engine.

Still, for a 100% java solution (with nodejs as lib) see: https://github.com/eclipsesource/J2V8

I built an asset module on top of J2V8 for Jooby

My asset module works with the template engine of your choice and of course next release of Jooby comes with pebble: https://github.com/jooby-project/jooby/issues/247

mbosecke commented 8 years ago

I'm still a fan of the idea of being able to create content in one section of a template and having it "hoisted" to another part of the template but the idea of virtually doubling the render time and making such large architectural changes just doesn't seem worth it. I'm closing this for now.