Closed kimmobrunfeldt closed 5 years ago
You might be interested in this pooling solution I made in a fork of your project. It uses the generic-pool
package and has already been tested with thousands of PDF's (we are using it quite heavily). It doesn't use tab pooling since it was very unstable when I tried it, but this browser pooling solution already offers much better performance.
Thanks for your comment @SpaceK33z.
Based on that, I implemented pool support at browserless
One thing, are you defined 4 and 8 as size of your pool for something?
In addition, I added a limitation of 50 uses per browser: https://github.com/Kikobeats/browserless/pull/15/files#diff-6499deccd988c3ae5b5ba9a58e719d5aR8
Cool!
One thing, are you defined 4 and 8 as size of your pool for something?
Those max number is more or less based on the hosting provider they run in, to not use too much memory / cpu. The min number means that at minimum 4 browsers always are running, which can improve the performance compared to e.g. 3 if you get 4 requests at the same time (that would mean one browser would still have to start).
@SpaceK33z in my case I'm deploying to AWS Lambda.
Are you using any script for benchmarking what values work better?
I made this very basic script back when I was trying to use tab pooling instead of browser pooling (which caused a weird memory leak): https://github.com/Volst/url-to-pdf-api/blob/master/loadtest.js
It tries to load a test invoice. You can use the script like this: node loadtest.js 10
(10 being the amount of requests). All the requests will happen in parallel. It would be nice to improve the script to set the concurrency you want to use (e.g. 100 requests, max. 5 at the time).
I'm closing this for now because the solution without pooling seems to work quite well for many use cases. My hypothesis is that it wouldn't make a huge difference, but that can be only verified by testing.
Workaround for this is to simply scale new servers (or Heroku dynos) horizontally. RAM usage will probably be the bottleneck of scaling.
Each requests starts a new instance with Puppeteer. We should use a pool of e.g. 4 tabs to make rendering as a service more reliable.
See: https://github.com/GoogleChrome/puppeteer/issues/518