Additional info: running the command line version of the node app:
$ p5serial
... displays identical behavior: bloating and eventually crashing. So maybe I'm reporting this bug in the wrong place?
Would love a bit more information. What platform and version? Are you running a p5.js sketch, or are you just seeing this with p5.serialcontrol and/or the command-line version?
A code example would be great.
Hi @vanevery:
Mostly complete code example (including "how to replicate") here:
https://gist.github.com/rdpoor/42829a3d224a151172e1b9668468bfbb
As for versions:
Ok.. Thanks for writing all of that up. I'll try it out in the next day or two.
I have noticed that the Chrome console generally can't handle a great deal of data and often ends up crashing or becoming unresponsive. I know you said that the "node" process is the one with the issue, but is it possible that it's Chrome?
Can you take out the console.log(serial.readLine()); and do just serial.readLine(); in your sketch, to make sure?
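(For reference, a minimal version of that test might look like the sketch below. This is an illustration only, assuming the standard p5.serialport client setup; the device path and callback name are placeholders.)
// Read incoming lines without logging them, to rule out Chrome's console.
let serial;

function setup() {
  noCanvas();
  serial = new p5.SerialPort();
  serial.open('/dev/cu.usbmodem1411'); // placeholder device path
  serial.on('data', gotData);          // called whenever new serial data arrives
}

function gotData() {
  // Read and discard the line -- no console.log() involved.
  serial.readLine();
}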
To be clear, I was watching node with ps gux, so I know it was node and not Chrome that was bloating (unless Chrome changed its name! ;).
But this is frustrating. Now when I run p5serial, I'm getting:
$ p5serial
p5.serialserver is running
Error: Error: Device not configured, cannot open /dev/cu.usbmodem1411
at Error (native)
I can't think of any changes I made since testing the gist, and I don't know of any other process that could be holding the USB port open. I can run my simple USB monitor to verify that data is streaming in over USB.
I have to take care of a few other things and will get back to this when I can. Let me know if you have any thoughts in the meantime! Thanks...
Ok, it's probably something with p5.serialserver.. I'll take a look.
Nah - it was a brain bubble. When I restarted everything, the device came up as cu.usbmodem118. Standby...
Got your smoking gun right here (with the console.log() removed):
[~]$ date ; ps guxw 16710
Wed Aug 2 08:03:12 PDT 2017
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
r 16710 0.0 0.1 3060492 24096 s000 S+ 8:02AM 0:00.13 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$ date ; ps guxw 16710
Wed Aug 2 08:03:42 PDT 2017
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
r 16710 103.3 1.0 3232248 168580 s000 R+ 8:02AM 0:18.99 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$ date ; ps guxw 16710
Wed Aug 2 08:04:25 PDT 2017
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
r 16710 101.0 2.7 3508852 448028 s000 R+ 8:02AM 1:03.16 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$ date ; ps guxw 16710
Wed Aug 2 08:05:21 PDT 2017
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
r 16710 100.4 4.8 3874396 813372 s000 R+ 8:02AM 1:59.54 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$ date ; ps guxw 16710
Wed Aug 2 08:05:53 PDT 2017
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
r 16710 100.0 6.3 4120312 1060592 s000 R+ 8:02AM 2:32.73 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$
... after which p5serial unceremoniously died:
$ p5serial
p5.serialserver is running
<--- Last few GCs --->
268833 ms: Mark-sweep 1346.7 (1396.1) -> 1343.0 (1401.1) MB, 699.6 / 0 ms (+ 51.9 ms in 95 steps since start of marking, biggest step 3.8 ms) [allocation failure] [GC in old space requested].
269769 ms: Mark-sweep 1343.0 (1401.1) -> 1342.8 (1403.1) MB, 935.8 / 0 ms [allocation failure] [GC in old space requested].
270697 ms: Mark-sweep 1342.8 (1403.1) -> 1342.7 (1403.1) MB, 928.5 / 0 ms [last resort gc].
271647 ms: Mark-sweep 1342.7 (1403.1) -> 1342.7 (1403.1) MB, 950.1 / 0 ms [last resort gc].
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x2ead25fc9e59 <JS Object>
2: send [/Users/r/.nvm/versions/node/v6.3.0/lib/node_modules/p5.serialserver/node_modules/ws/lib/Sender.js:~100] [pc=0x2f08ed0b75f] (this=0x350b3b143ad1 <a Sender with map 0xe238965fee1>,data=0x3eb3eb2c7701 <String[27]: {"method":"data","data":46}>,options=0x3eb3eb2c7789 <an Object with map 0xe2389660569>,cb=0x2ead25f04189 <undefined>)
3: send [/Users/r/.nvm/versions/node/v6.3.0/lib/nod...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: node::Abort() [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
2: node::FatalException(v8::Isolate*, v8::Local<v8::Value>, v8::Local<v8::Message>) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
3: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
4: v8::internal::Factory::NewUninitializedFixedArray(int) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
5: v8::internal::(anonymous namespace)::ElementsAccessorBase<v8::internal::(anonymous namespace)::FastPackedObjectElementsAccessor, v8::internal::(anonymous namespace)::ElementsKindTraits<(v8::internal::ElementsKind)2> >::ConvertElementsWithCapacity(v8::internal::Handle<v8::internal::JSObject>, v8::internal::Handle<v8::internal::FixedArrayBase>, v8::internal::ElementsKind, unsigned int, unsigned int, unsigned int, int) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
6: v8::internal::(anonymous namespace)::ElementsAccessorBase<v8::internal::(anonymous namespace)::FastPackedObjectElementsAccessor, v8::internal::(anonymous namespace)::ElementsKindTraits<(v8::internal::ElementsKind)2> >::GrowCapacityAndConvertImpl(v8::internal::Handle<v8::internal::JSObject>, unsigned int) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
7: v8::internal::Runtime_GrowArrayElements(int, v8::internal::Object**, v8::internal::Isolate*) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
8: 0x2f08dd06338
Abort trap: 6
Yup.. I'll dig some this week.
Thanks for the detailed reporting.
BTW, I also tried a version where I added a significant delay between serial writes -- my hypothesis was that I'd swamped the node/V8 runtime so it didn't have time to do a GC. However, it still bloated up, albeit much more slowly.
Also, after digging around, I thought I'd try to help by running node with the nifty --inspect option:
$ node --inspect `which p5serial`
Debugger listening on port 9229.
Warning: This is an experimental feature and could change at any time.
To start debugging, open the following URL in Chrome:
chrome-devtools://devtools/remote/serve_file/@521e5b7e2b7cc66b4006a8a54cb9c4e57494a5ef/inspector.html?experiments=true&v8only=true&ws=localhost:9229/node
p5.serialserver is running
I then opened the given URL in Chrome and clicked on "Take Heap Snapshot". It reported "Snapshot 1: Snapshotting" but never completed (even before I started hitting serialserver with massive amounts of data).
Am I missing something about how the Chrome tool works?
Shawn:
A helpful netizen suggested I use Chrome Canary to capture the snapshots. Worked great.
So I've captured three Chrome heap snapshots (before running, during running, just about to crash). In the latter two, it's clear that "system/Context" and "(closure)" dominate the bloat, consuming 72% of memory.
I assume you'd be able to use these? Since the last one is about half a gigabyte, I've zipped them together and uploaded them to Google Drive:
https://drive.google.com/open?id=0B2_wIfsXTC9DaFR0SDlnamhBd2c
Hi, I can duplicate the memory leak you are experiencing. I also captured heap snapshots with Chrome and can see approximately where the leak is happening; I believe it's in the ws module, which is used for the socket communication between the server and clients.
Here is some more information about it: https://github.com/websockets/ws/issues/804
The solution seems to be to add "perMessageDeflate: false" to the WS options.
I am trying that now by changing line 24 in p5.serialserver.js to
wss = new WebSocketServer({perMessageDeflate: false, port: SERVER_PORT});
I haven't run it for a long time but so far it seems to be working quite a bit better. Can you also try it?
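(For context, the shape of that change is roughly the sketch below. This is not the actual p5.serialserver source; the port value 8081 is assumed to be the default WebSocket port that p5.serialport clients connect to.)
// Construct the ws server with per-message compression disabled.
// perMessageDeflate: false avoids the leak discussed in websockets/ws#804.
var WebSocketServer = require('ws').Server;
var SERVER_PORT = 8081; // assumed default port for p5.serialport clients

var wss = new WebSocketServer({
  perMessageDeflate: false,
  port: SERVER_PORT
});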
Turning off perMessageDeflate seems to have quashed the major leak. It seems likely that there's some other leak, but it's much slower. Thank you!
Two follow-on questions below, but this is good enough for now. I've patched my copy of p5.serialserver.js.
In the ps gux output below, you can see that VSZ climbs slowly (I should have timestamped the output), but at least RSS decreases once in a while -- something it definitely wasn't doing before.
Thanks for your sleuth work and follow-up. Let me know if I can help more.
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
r 25821 0.0 0.1 3080972 24440 s002 S+ 3:38AM 0:00.13 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 64.8 0.2 3132844 31148 s002 R+ 3:38AM 0:12.20 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 58.0 0.2 3132844 35832 s002 R+ 3:38AM 0:18.08 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 62.0 0.2 3132844 36564 s002 R+ 3:38AM 0:27.00 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 62.3 0.2 3132844 36844 s002 S+ 3:38AM 0:36.58 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 74.4 0.2 3132844 28724 s002 R+ 3:38AM 0:46.93 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 71.2 0.2 3132844 29828 s002 R+ 3:38AM 0:55.87 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 49.2 0.2 3132844 39928 s002 R+ 3:38AM 1:05.58 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 23.3 0.6 3153168 93904 s002 S+ 3:38AM 1:11.16 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 47.3 0.5 3147320 89608 s002 R+ 3:38AM 1:20.65 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 45.9 0.5 3147320 89976 s002 R+ 3:38AM 1:27.33 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 18.1 0.9 3202764 155456 s002 S+ 3:38AM 1:30.67 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 18.1 0.9 3202612 144948 s002 R+ 3:38AM 1:34.23 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 48.3 0.8 3193624 136752 s002 R+ 3:38AM 1:40.48 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 51.6 0.8 3193624 134948 s002 R+ 3:38AM 1:42.83 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 51.1 0.7 3193624 124980 s002 S+ 3:38AM 1:50.23 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r 25821 50.8 0.7 3193624 122920 s002 R+ 3:38AM 1:57.92 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
Does this mean that serialport opens a new socket for every string? (Sorry, I haven't dug into the code that deeply...)
No, one socket is opened per client, but perMessageDeflate would apply to each message through that socket.
I gather a real fix depends upon nodejs/node#8871, i.e. not soon, correct?
Right, but the above fixes that, so we should be safe for now.
Memory usage should grow to a certain point and then stabilize.
Thanks for bringing the issue up and helping to solve it!
Reopening: although this is fixed in p5.serialport via this commit: https://github.com/vanevery/p5.serialport/commit/b6b21d648d3400fefda05465978b4c5bdab5b579, p5.serialcontrol still needs an update to incorporate the latest p5.serialport.
It is now updated, so closing this issue.
I have an Arduino-like device (a Particle Electron) that is continually spitting out strings via USB. When I monitor it using particle serial monitor, it behaves as expected. When I monitor it using Alpha 5 p5.serialcontrol (and a simple p5 sketch to display the strings), it initially runs fast, but soon starts slowing down and eventually stops.
Watching its memory usage, p5.serialcontrol starts at 150 MB, but before long bloats up to > 1 GB and eventually crashes.
I suspect a memory leak. LMK if a code example would help.
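(For completeness, a rough approximation of the monitoring sketch described above -- not the actual gist -- might look like this, again with the device path as a placeholder:)
// Open the Electron's port and draw each incoming line to the canvas.
let serial;
let latest = 'waiting for data...';

function setup() {
  createCanvas(400, 100);
  serial = new p5.SerialPort();
  serial.open('/dev/cu.usbmodem1411'); // placeholder device path
  serial.on('data', gotData);
}

function gotData() {
  let line = serial.readLine();
  if (line.length > 0) {
    latest = line;
  }
}

function draw() {
  background(255);
  text(latest, 10, 50);
}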