Closed coreyweidenhammer closed 5 years ago
Hey @coreyweidenhammer Sorry I didn't respond earlier. Thanks for reporting this and the project to replicate it.
I don't know wat may be causing this to be honest. My first guess is that react-pdf caches assets in memory to avoid network request for the same item, but it has a limit, so the memory footprint should stop increasing eventually.
I'll try to find some time to look a it, but it's going to be hard to debug.
Related to https://github.com/foliojs/pdfkit/issues/258 ?
i also experience memory leaks on client rendering but only on some devices
It just happens on some client machines of different kind, Android, Mac etc. but I can't reproduce it myself.
tl;dr: Each render leaks PDFKit objects.
I've been running into this issue as well and put together a small reproduction gist: https://gist.github.com/n6g7/4d9d9af8eb5b943b61c89c1c6fa29a3b
When running this on my mac I get the data in this spreadsheet. Basically the the example above leaks about 250kB per render in memory.
I also used node-heapdump to inspect the objects in memory and I noticed many StandardFont
, Root
, PDFDocument
, PDFReference
, etc objects that aren't being garbage collected. I think these are pdfkit classes.
I don't really understand what's happenign with pdfkit, its last release was 2 years ago (v0.8.3) but there's been a lot of activity since then on the repo, which doesn't appear to have been released. I assume this is why react-pdf uses a fork (@react-pdf/pdfkit
).
Next step is to reproduce the leak using pdfkit only, could someone point to the source code of @react-pdf/pdfkit?
Thanks for your input and work on this! Sorry I cannot help much right now. I’m out of town until tomorrow. I’ll soon resume my work on this library. About the @react-pdf/pdfkit source code, you can find it here
@kations the problem is just yoga-layout-prebuilt
(dep of @react-pdf/renderer
) which is an asm.js build of yoga-layout
(it's not a memory leak). If you use electron or node, you can prefer to patch @react-pdf/renderer
in order to use yoga-layout
, then it will build a native node module instead. The TOTAL_MEMORY=134217728
limit is only for the asm.js build.
@Skywalker13 I get similar results with a patched version of @react-pdf/renderer
(using yoga-layout
instead of yoga-layout-prebuilt
).
Could you elaborate on the changes you've made?
I suppose that you use react-pdf in frontend (browser side) and not in server side (with babel-node like me). In this case you can continue to use the asm.js version.
Look at the package.json
of yoga-layout...
If I build for node by hand:
$ npm run build:node
> yoga-layout@1.9.3 build:node /home/schroeterm/tmp/node_modules/yoga-layout
> npm run copy-sources && autogypi && node-gyp configure build
> yoga-layout@1.9.3 copy-sources /home/schroeterm/tmp/node_modules/yoga-layout
> ! npm -s run is-monolithic || (rsync -r --checksum --delete ../yoga/ sources/yoga/)
make : on entre dans le répertoire « /home/schroeterm/tmp/node_modules/yoga-layout/build »
COPY Release/obj.target/nbind/geni/symbols.txt
CXX(target) Release/obj.target/nbind/sources/yoga/Utils.o
CXX(target) Release/obj.target/nbind/sources/yoga/YGEnums.o
CXX(target) Release/obj.target/nbind/sources/yoga/YGNode.o
CXX(target) Release/obj.target/nbind/sources/yoga/YGNodePrint.o
CXX(target) Release/obj.target/nbind/sources/yoga/Yoga.o
BLABLABLA...
SOLINK_MODULE(target) Release/obj.target/nbind.node
COPY Release/nbind.node
make : on quitte le répertoire « /home/schroeterm/tmp/node_modules/yoga-layout/build »
I have a node module for nbind which is used by yoga-layout
.
$ ls -la build/Release/
total 516
drwxr-xr-x 4 schroeterm schroeterm 4096 jan 31 07:08 .
drwxr-xr-x 3 schroeterm schroeterm 4096 jan 31 07:08 ..
drwxr-xr-x 3 schroeterm schroeterm 4096 jan 31 07:08 .deps
-rwxr-xr-x 1 schroeterm schroeterm 508360 jan 31 07:08 nbind.node
drwxr-xr-x 3 schroeterm schroeterm 4096 jan 31 07:08 obj.target
No more TOTAL_MEMORY
problem because it's a native build. Of course, it can be used only on server side.
And if you want to build for the browser, you must install an emscripten environment and change the TOTAL_MEMORY
to an higher value. For example:
diff --git a/javascript/final-flags.gypi b/javascript/final-flags.gypi
index dd13e827..d3e1ad26 100644
--- a/javascript/final-flags.gypi
+++ b/javascript/final-flags.gypi
@@ -19,7 +19,7 @@
"ldflags": [
"--memory-init-file", "0",
"-s", "PRECISE_F32=1",
- "-s", "TOTAL_MEMORY=134217728"
+ "-s", "TOTAL_MEMORY=268435456"
],
"xcode_settings": {
And try to run npm run build:browser
We’ve encountered these heap overflow errors as well. They happen reliably when we try to generate documents with a large number of pages (~70 pages), or when we render a large number of separate documents with a single page.
@diegomura It looks like pdfkit
has been updated with some fixes that might solve this memory leak problem: https://github.com/foliojs/pdfkit/issues/258#issuecomment-468976025. How much work is involved in pulling these fixes into @react-pdf/pdfkit
?
Have encountered this issue as well.
Worked around it by moving the PDF generation to an AWS lambda for now. @diegomura do you need help pulling the changes in https://github.com/foliojs/pdfkit/issues/258#issuecomment-468976025 to your own pdfkit
? Quite new here but happy to help if you think it could solve the issue here
Yes!! Help is much appreciated. Don't have much time to that yet @Tketa
@Tketa let me know if I can help!
@Tketa @diegomura @n6g7 I think this is critical to anyone using react-pdf
in production, and it needs to go forward quickly. How do you think we should proceed ?
@react-pdf/pdfkit
react-pdf
, and we use the gist from @n6g7 to test as wellHow does that sound to you ?
@maxaggedon Sounds good to me, I'm quite busy this week so can't commit to opening the PR and resolving conflicts but will review and test in my projects for sure.
Also sounds good to me. I'm also in the middle of something right now (react-pdf related also), so I'm not sure if I can commit to work on this right now. I agree with your plan and I promise I'll review code when needed
Hey @maxaggedon, agree with the plan but as mentioned I fixed the issue with the serverless solution so not a priority for me this week, can work on it this weekend however 👍
know everyone is on board ! I will try to start the work during the week.
PR is open 👆!
Thanks @maxaggedon !! You rock I'll try to find some time this week to fully test that PR
I've tried to pinpoint where exactly is @n6g7 gist leaking and made a couple of changes to leaktest.js
:
function go(n) {
for (let i = 1; i <= n; i++) {
if (i === 1 || i === 500) {
heapdump.writeSnapshot('out/' + Date.now() + '.heapsnapshot');
}
const element = React.createElement(Component);
// taken from pdf() function
const container = createInstance({ type: 'ROOT' });
const mountNode = PDFRenderer.createContainer(container);
// leaks occur because of updateContainer
PDFRenderer.updateContainer(element, mountNode, null);
// leak even more
// container.render();
...
}
}
~~We don't even need to render()
for leaks to occur, something in PDFRenderer.updateContainer
is maybe leaking. In that case we have a leaked Root
and Page
instances in heapdump.
But I don't now anything about ReactFiberReconciler
so could this be expected behaviour maybe?
@diegomura Any ideas?~~
[edit]
Looks like after using global.gc()
just before writing snapshot everything is fine.
@diegomura
After doing some additional research, it looks like the problem is in the line
this.layout.setMeasureFunc(this.measureImage.bind(this));
This is in both Image
and Text
constructor.
Because of that text and image instances are stored in layout
and can't be freed.
Somewhere in the code must be a call to this.layout.unsetMeasureFunc()
.
I think you were actually on the right path in this comment
Just to confirm that this is an issue I tried leaktest.js
with modified render()
of Text class:
async render() {
// rendering code
this.layout.unsetMeasureFunc();
}
For a page with wrap: false
this produced no leaks.
But for a page with wrap: true
there were leaks just as before because for the cloned Text intances unsetMeasureFunc was not called.
The question is where should measureFunc be unset?
Amazing progress, thanks @bhunjadi!!
I was able to reproduce your investigation: with the wrap: false
prop in <Page>
, the following does not leak:
diff --git a/src/elements/Text.js b/src/elements/Text.js
index 67db03e..c9f42a0 100644
--- a/src/elements/Text.js
+++ b/src/elements/Text.js
@@ -264,6 +264,7 @@ class Text extends Base {
}
this.root.instance.restore();
+ this.layout.unsetMeasureFunc();
}
}
Adding a similar call to the clone method also fixes it for wrap: false
:
diff --git a/src/elements/Text.js b/src/elements/Text.js
index 67db03e..5d17f83 100644
--- a/src/elements/Text.js
+++ b/src/elements/Text.js
@@ -216,6 +216,7 @@ class Text extends Base {
const text = super.clone();
text.layoutEngine = this.layoutEngine;
+ this.layout.unsetMeasureFunc();
// Save calculated layout for non-dynamic clone elements
if (this.blocks && !this.props.render) {
This looks like a very promising lead now. :)
@diegomura any ideas where would be the correct place to call unsetMeasureFunc
?
Thanks for this amazing research! Sorry I've been a bit unresponsive.
I guess a good place to call unsetMeasureFunc
is either before or after rendering, both should work since at that stage every elements knows where to position. However, because how page wrapping works (recursively clone elements while they are being placed in pages), it's highly probable that unsetMeasureFunc
should be called also somewhere else. But not really sure where yet without further analysis.
One of the main performance bottlenecks of this project is the page wrapping algorithm. As I said, in order to contemplate all cases, it clones and re-layout many times, which makes it very hard to debug and uses a lot of memory. I've been trying to think of a simpler solution but couldn't find one yet that supports all what we currently do. And there's no prior state of the art solution to this (that I could find).
I've added PR that looks like it is working.
Thanks @n6g7 for the clone
hint for wrapped pages. In the end what I used in PR is cleanup in removeChild
which seems to be working.
Thank you to everyone who has worked on this!
My team is still wrapping up testing all of our scenarios, but preliminary results from v1.5.6 appear to confirm that the memory leak is indeed resolved. I'll post another update when we are finished.
Thank you @coreyweidenhammer for testing! Please post any further details. Closing this issue 😄
~I'm on @react-pdf/renderer@^1.6.2
and having this problem it seems. My use case is exporting chat transcripts which can have a lot of messages, and after having exported around 700 PDFs, it stops doing anything (or panics with a heap out of memory error).~
~I tried using clinic
to profile, and it indeed seems memory just keeps ramping up. Am I missing something? 😄~
EDIT: Seems I'm having another problem, will open a new issue 😄
Anyone got around of building for Yoga for node
? We are running the latest v2 and we are facing everyday this issue.
@Skywalker13
Patches exists for nbind and node v12, but it's annoying because it's necessary to fork yoga-layout in order to use an other nbind. And even in this case it's annoying because it breaks with electron 8 and in my case, I need yoga with node 12 and electron 8.
I've tried yoga-wasm but it's broken with react-pdf: https://github.com/rickbutton/yoga-wasm/issues/4
And just now, I've discovered https://github.com/pinqy520/yoga-layout-wasm
I will try this one asap...
@Skywalker13 any luck on your try outs? We have no solution for this so far other than manually restarting the process :(
I tried a bit, it's not just a drop-in replacement. It's necessary to fork and adapt a little bit react-pdf (because the init is async with the wasm variants like yoga-layout-wasm), and maybe it doesn't work (like yoga-wasm), I don't know.
I've some success with react-pdf and yoga-layout built for some node versions but it's a bit annoying (for example https://github.com/Xcraft-Inc/yoga/commit/a0039ced2823892eef0f94a5b32a1809e9f4b34f).
IMHO It's definitively react-pdf which should use something else...
Before I get into the issue, I just wanted to thank @diegomura and others for your work and responsiveness on this wonderful library! Your efforts are greatly appreciated.
OS:
React-pdf version: 1.0.0-alpha.24, 1.0.0-alpha.25 Node v8.9.3
Description: We are consistently observing a memory leak, with the memory footprint of our node process growing linearly with each generated PDF until eventually exhausting the resources on our servers.
My team saw https://github.com/diegomura/react-pdf/issues/165 (now closed), but we are uncertain as to whether what we are experiencing is related or not. As noted above, we have tried using both alpha.24 and alpha.25 with the same results.
We first noticed the memory leak on servers running in AWS via ElasticBeanstalk (Amazon Linux) but have since been able to recreate it on our local development environments running macOS 10.13.6. We attempted isolating the source of the leak ourselves but have not yet come up with anything conclusive. We're not even sure if the memory leak is (as before) related to a dependency vs the react-pdf code itself, but we are hoping you might be able to identify something more readily.
How to replicate issue: We created a small project with limited dependencies that should allow you to recreate the issue as we see it. Zip attached here: pdfGenerator.zip
There are instructions in server.js, repeated below:
RUN $ yarn / npm install and then run: $ npm run start-with-chrome-inspect
An alternate path to viewing the memory leak at a higher level is to monitor via pm2: $ yarn / npm install pm2 $ pm2 start server.js $ pm2 monit -- proceed to access http://localhost:3000/pdf repeatedly -- and watch the memory footprint grow.
Note that the memory footprint grows in proportion to the size of the PDF generated, so if you want to see it grow/leak more readily, you can create a larger PDF with more content, images, etc.
Thank you in advance for your help!