Closed by getlarge 7 months ago
Here is the outcome of some high-level benchmarks for the PDF generation examples. I refactored the examples to isolate the following phases:
- generating the Content from the JSON certificate, which runs in a Node vm
- PdfPrinter and creation of the pdfkit PDFDocument
| Phase | Duration (ms) | Memory used (heap) (MB) |
|---|---|---|
| before | X | 114.41 |
| generateContentInSandbox | 12.679 | 114.86 |
| | 753.185 | 135 |
| store | 1294.031 | 125.32 |
| after | 2060.447 | 125.34 |
| Phase | Duration (ms) | Memory used (heap) (MB) |
|---|---|---|
| before | X | 116.73 |
| generateContentInSandbox | 35.927 | 121.09 |
| | 665.618 | 142.89 |
| store | 1207.106 | 133.04 |
| after | 1909.240 | 133.05 |
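Measurements like the ones in the tables above can be collected with a small wrapper around each phase. The following is only a minimal sketch, not the code used for this benchmark: `measurePhase`, `PhaseResult` and `fakePhase` are hypothetical names, and `fakePhase` merely stands in for the real phases.

```typescript
import { performance } from 'node:perf_hooks';

// One row of the benchmark table: phase name, elapsed time, heap after the phase.
interface PhaseResult {
  phase: string;
  durationMs: number;
  heapUsedMb: number;
}

const toMb = (bytes: number): number =>
  Math.round((bytes / 1024 / 1024) * 100) / 100;

// Runs `fn`, then records its wall-clock duration and the heap size once it has finished.
async function measurePhase<T>(
  phase: string,
  fn: () => Promise<T> | T,
  results: PhaseResult[],
): Promise<T> {
  const start = performance.now();
  const value = await fn();
  results.push({
    phase,
    durationMs: performance.now() - start,
    heapUsedMb: toMb(process.memoryUsage().heapUsed),
  });
  return value;
}

// Hypothetical stand-in for a real phase: resolves after `ms` milliseconds.
const fakePhase = (ms: number): Promise<string> =>
  new Promise((resolve) => setTimeout(() => resolve('done'), ms));

async function main(): Promise<PhaseResult[]> {
  const results: PhaseResult[] = [];
  await measurePhase('generateContentInSandbox', () => fakePhase(10), results);
  await measurePhase('store', () => fakePhase(20), results);
  console.table(results);
  return results;
}

main().catch(console.error);
```

Heap figures taken this way are snapshots after each phase, so they depend on when the garbage collector last ran; they are best read as rough indicators.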
It is quite surprising that most of the processing time is spent in the store phase, when reading the pdfkit PDFDocument stream.
I will generate some CPU profiles to check where (which modules and functions) precisely the time is spent.
The flame graph below results from CPU profiling of the store phase. It shows that most of the time is consumed by the function tinf_inflate_block_data, which comes from the tiny-inflate library and decompresses buffers during font processing (!).
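For anyone wanting to reproduce such a profile, one option is Node's built-in `inspector` module, which exposes the V8 CPU profiler. This is a sketch under assumptions, not necessarily how the profile above was produced: `collectCpuProfile`, `post` and `busyWork` are hypothetical helpers, and `busyWork` is only a CPU-bound placeholder for the store phase.

```typescript
import { Session } from 'node:inspector';
import { writeFileSync } from 'node:fs';

// Minimal shape of the profile returned by 'Profiler.stop'.
interface CpuProfile {
  nodes: unknown[];
}

// Promise wrapper around the callback-based Session.post.
function post<T = unknown>(session: Session, method: string): Promise<T> {
  return new Promise((resolve, reject) => {
    session.post(method, (err: Error | null, result?: object) =>
      err ? reject(err) : resolve(result as T),
    );
  });
}

// Profiles `fn` and returns both its result and the raw CPU profile.
async function collectCpuProfile<T>(
  fn: () => Promise<T> | T,
): Promise<{ value: T; cpuProfile: CpuProfile }> {
  const session = new Session();
  session.connect();
  try {
    await post(session, 'Profiler.enable');
    await post(session, 'Profiler.start');
    const value = await fn();
    const { profile } = await post<{ profile: CpuProfile }>(session, 'Profiler.stop');
    return { value, cpuProfile: profile };
  } finally {
    session.disconnect();
  }
}

// Hypothetical CPU-bound placeholder for the phase under investigation.
function busyWork(iterations = 1_000_000): number {
  let sum = 0;
  for (let i = 0; i < iterations; i++) sum += Math.sqrt(i);
  return sum;
}

// Writes a .cpuprofile file that Chrome DevTools can load and show as a flame chart.
async function saveStoreProfile(): Promise<void> {
  const { cpuProfile } = await collectCpuProfile(() => busyWork());
  writeFileSync('store-phase.cpuprofile', JSON.stringify(cpuProfile));
}
```

Running the whole script with `node --cpu-prof` is a simpler alternative when there is no need to restrict the profile to a single phase.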
I now think of the following potential improvement tracks:
The most straightforward track is "try another font". I did that, and the results are hard to believe!
| Phase | Duration (ms) | Memory used (heap) (MB) |
|---|---|---|
| before | X | 121.57 |
| generateContentInSandbox | 32.749 | 125.93 |
| | 107.098 | 125.01 |
| store | 108.804 | 127.53 |
| after | 249.143 | 127.53 |
Yes, you read correctly; in total duration, the process is roughly 7 to 8 times faster. To achieve this result, I simply replaced the current lato-font package with this one. I understand that lato-font is being replaced, but I encourage you to run this benchmark before making a choice.
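The speedup can be checked directly against the tables. The snippet below is a worked calculation that assumes the "after" rows are the cumulative totals and that the first two tables were both measured with the original font; the variable names are illustrative only.

```typescript
// Totals ("after" rows) and store-phase durations copied from the tables, in ms.
const withOriginalFont = [
  { total: 2060.447, store: 1294.031 }, // first table
  { total: 1909.24, store: 1207.106 }, // second table
];
const withNewFont = { total: 249.143, store: 108.804 }; // third table

// Ratio of old to new duration: how many times faster the new run is.
const speedup = (oldMs: number, newMs: number): number => oldMs / newMs;

for (const run of withOriginalFont) {
  const total = speedup(run.total, withNewFont.total).toFixed(1);
  const store = speedup(run.store, withNewFont.store).toFixed(1);
  console.log(`total: ${total}x, store: ${store}x`);
}
// prints "total: 8.3x, store: 11.9x"
// prints "total: 7.7x, store: 11.1x"
```

The store phase alone improves by a larger factor than the total, which fits the profiling result that font decompression dominated that phase.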
I suspect performance could be even better when using optimal fonts with the WOFF2 format, but during the trial, I ran into a
RangeError [ERR_BUFFER_OUT_OF_BOUNDS]: Attempt to access memory outside buffer bounds
thrown by fontkit.
And this is the rendered PDF: generating.pdf
The fonts are not exactly the same weight as previously, but it is just a matter of picking the right ones from the @fontsource/lato package.
Nice job with the benchmarking, remarkable how much of a difference changing the font made! After reading through this, I recommend that the related PR be merged first and that @christophbuehler runs the performance benchmarks against the proposed new font as the time saved is significant!
Indeed, the results with the new font are surprising. I was also not able to get a PDF/A-3 compliant document with that font because of some incorrect measurements. However, to keep things separated, we should merge #241 as soon as possible and update the performance monitoring (this branch) to take the most recent code changes (attachments / new font / PDF/A-3) into consideration. Let's not mix responsibilities.
Description
From our usage of S1Seven's platform, we observed that producing a PDF for an average certificate takes 2-3 seconds. Still, we have yet to determine where most of the time is spent and if the process can be improved.
I suggest running measurements in our example scripts and eventually adding some thresholds in our integration tests.
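As a rough illustration of what such thresholds could look like, the sketch below compares measured phase durations against per-phase budgets. Everything here is hypothetical: the `findViolations` helper, the `PhaseTiming` shape, and especially the budget values, which would need to be derived from repeated runs on the CI hardware.

```typescript
interface PhaseTiming {
  phase: string;
  durationMs: number;
}

// Hypothetical budgets in milliseconds, chosen only for illustration.
const budgetsMs: Record<string, number> = {
  generateContentInSandbox: 100,
  store: 500,
};

// Returns one message per phase whose duration exceeds its budget.
// Phases without a budget are ignored.
function findViolations(
  timings: PhaseTiming[],
  budgets: Record<string, number>,
): string[] {
  return timings
    .filter(({ phase, durationMs }) => {
      const budget = budgets[phase];
      return budget !== undefined && durationMs > budget;
    })
    .map(
      ({ phase, durationMs }) =>
        `${phase}: ${durationMs.toFixed(1)} ms exceeds ${budgets[phase]} ms`,
    );
}

// Sample timings taken from one of the benchmark runs reported in this thread.
const sample: PhaseTiming[] = [
  { phase: 'generateContentInSandbox', durationMs: 35.927 },
  { phase: 'store', durationMs: 1207.106 },
];

console.log(findViolations(sample, budgetsMs));
// prints [ 'store: 1207.1 ms exceeds 500 ms' ]
```

An integration test could then assert that the returned list is empty. Budgets should leave generous headroom, since timings on shared CI runners vary considerably.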