assemble / assemble-core

The core assemble application with no presets or defaults. All configuration is left to the implementor.
MIT License
17 stars 2 forks source link

Increasing Render Times with Concurrency and RPS #15

Closed dtothefp closed 7 years ago

dtothefp commented 7 years ago

I'm trying to debug some load testing issues of increasing async render times with high concurrency and rqs. I'm not sure if this is purely assemble related or potentially related to nunjucks as well. When I render templates on Koa or Express load testing with increasing amounts of connections (and potentially rqs, but is seems when testing with wrk2 that number of connections is the bottleneck) I see increasing render times, whereas rendering purely with nunjucks I don't see these orders of magnitude increases.

For Example:

wrk -R 5000 -d 20s -t 1 -c 1

# nunjucks
(RENDER_DURATION=24)

# assemble with nunjucks
(RENDER_DURATION=6)

#### Assemble Wins --- :-))))
wrk -R 1000 -d 20s -t 1 -c 100

# nunjucks
(RENDER_DURATION=3)

# assemble with nunjucks
(ASSMBLE_RENDER_DURATION=406)

#### :-((((

It gets even worse increasing the threads -t option in wrk2

I'm not sure what this is related to, have messed around a bunch with ulimit and a little with Node memory but can't make rhyme or reason about it. Weird stuff happens that if I use assemble and also try to read files with async that process is prolonged as well. Not sure what is going on?

I setup this repo for testing with Koa and Express https://github.com/dtothefp/koa-express-load-testing

You can run tasks like:

npm run prod:koa #compile Koa with babel
npm run logs #tail the bunyan logs
npm run loadtest:assemble:high #use the loadtest module, wrk is better but this is ok

then try the same with just nunjucks

npm run loadtest:nunjucks:high
assemblebot commented 7 years ago

@dtothefp Thanks for the issue! If you're reporting a bug, please be sure to include:

jonschlinkert commented 7 years ago

My first reaction is that concurrency and rendering templates seems like a really bad idea. How would you guarantee the context is correct at any given time? Templates might be trying to render when the context either doesn't exist yet for a given template, or is wrong.

I'm not sure if this is purely assemble related or potentially related to nunjucks as well.

What have you learned so far that might help us in debugging? You're doing a lot of custom things in that setup, it might take a while for us to get familiar enough to get the answers you need.

Questions I have are related to how the timings are when:

jonschlinkert commented 7 years ago

for @doowb and I to be of help on this we'd need to have a setup that has the minimal code to reproduce the issue. Currently it's hard to say whether or not it's assemble or custom code. I'd like to help figure that out, but the learning curve would take a while as-is

jonschlinkert commented 7 years ago

however @doowb just mentioned that you're reading files from the fs on each request and re-rendering with a new instance of assemble.

A better approach would be to cache the template and compile it, then just use the compiled function to render the template with the new context

doowb commented 7 years ago

As @jonschlinkert just mentioned, you should probably either keep a single instance of assemble and load the files when the server is started so you don't have to read from the file system with each request.

Or, read the templates from the file system with fs.readFile so they're read in using an async method and don't block I/O.

dtothefp commented 7 years ago

creating an instance per request is definitely causing the weird behavior. I added a simplified setup with Express here https://github.com/dtothefp/koa-express-load-testing/blob/master/lib/templating-express/index.js

const setupTemplating = (fp) => {
  const start = duration();
  const app = assemble();
  const nunj = nunjucks.configure({
    watch: true,
    noCache: false
  });
  const templateFns = getGlobalData(config);

  Object.keys(templateFns).forEach(name => {
    nunj.addGlobal(name, templateFns[name]);
  });

  const templatePaths = addbase(srcDir, templateDir, fp || '**/*.html');

  app.engine('.html', consolidate.nunjucks);
  app.create('pages', {renameKey}).use(loader()); //create the `pages` type
  app.pages(templatePaths); //load the pages

  duration(start, 'TEMPLATE_SETUP');
  return app;
};

const as = setupTemplating();

app.get('/assemble', (req, res, next) => {
  const page = as.pages.getView('pages/index');
  const data = {
    type: 'Assemble',
    title: 'assemble',
    message: 'Hello Express'
  };
  const render = promisify(page.render, {ctx: page});
  const start = duration();

  render(data).then(({content: html}) => {
    res.status = 200;
    res.send(html);

    duration(start, 'ASSEMBLE_CACHED_RENDER');
  }).catch(err => {
    res.status = 503;
  });
});

app.get('/assemble-new', (req, res, next) => {
  const asInstance = setupTemplating('pages/index.html');
  const page = asInstance.pages.getView('pages/index');
  const data = {
    type: 'Assemble New Instance',
    title: 'assemble',
    message: 'Hello Express'
  };
  const render = promisify(page.render, {ctx: page});
  const start = duration();

  render(data).then(({content: html}) => {
    res.status = 200;
    res.send(html);

    duration(start, 'ASSEMBLE_NEW_RENDER');
  }).catch(err => {
    res.status = 503;
  });
});

You can have a look at the benchmarking https://github.com/dtothefp/koa-express-load-testing/blob/master/lib/templating-express/BENCHMARK.md

Render times and reading some arbitrary JSON with async in a middleware go way up when running concurrent connections, creating a new instance of assemble each time. Things look way better when using a single instance of assemble and just doing rendering in the request. That said, at high concurrency with only using a single instance of assemble the time for reading the JSON with async goes way up, which I'm confused why this would be happening?

doowb commented 7 years ago

EDIT: @dtothefp sorry, I was distracted and in a hurry when I wrote the last response. After re-reading it, it seemed kind of short. I've updated the response after looking through the code some more.

Things look way better when using a single instance of assemble and just doing rendering in the request.

I thought that would be the case since there's less file I/O happening for each request. If for some reason you need to dynamically load files, you could keep an object of already read filepaths as a cache so you know when something has been read. And if the file changes on the file system, remove the path from the cache so you can re-read it in on the next request.

high concurrency with only using a single instance of assemble the time for reading the JSON with async goes way up

I had to go back and look to see what you meant by this... I thought it was something to do with adding data to the assemble instance before rendering. But from looking at it, you're just reading in data and not doing anything with it, which does seem like it shouldn't slow down anything based on how assemble is being used.

One question though... how are the durations being used in the output in the benchmarks. Are those avg times for each key or is that just the last debug message for that key? If it's the last message, there could be something else that's just slowing down that specific request.

jonschlinkert commented 7 years ago

If it's getting progressively slower, that indicates a memory leak, which means an object is probably being built up some where. Something like:

Object.assign(foo, bar);

At render time, you need to make sure that you at least shallow clone, if not deep clone the context before passing it to render:

// make sure the first object is an empty object when shallow cloning
var ctx = Object.assign({}, foo, bar);

// or use a deep clone/merge
var merge = require('merge-deep');
var ctx = merge(foo, bar);
doowb commented 7 years ago

@dtothefp I edited my last response with more information after reading through some of the code and benchmarks.

dtothefp commented 7 years ago

@doowb yeh the logging isn't an average, just the last logs. I do see an increase in async io times when using Assemble, not sure what that's about? Anyway, this issue seems more related to how I'm using Assemble inappropriately so feel free to close.

Also, if I did cache templates in assemble.load and then needed to update the templates, would I just call assemble.load again on the updated paths?

doowb commented 7 years ago

would I just call assemble.load again on the updated paths?

Yes, you would just have to do app.pages(filepath) again.

jonschlinkert commented 7 years ago

thanks @dtothefp no worries