p5-serial / p5.serialcontrol

Desktop application and server for p5.serialport.js
Other
57 stars 28 forks source link

Memory leak? #12

Closed rdpoor closed 7 years ago

rdpoor commented 7 years ago

I have an Arduino-like device (a Particle Electron) that is continually spitting out strings via USB. When I monitor it using particle monitor serial it behaves as expected.

When I monitor it using Alpha 5 p5.serialcontrol (and a simple p5 sketch to display the strings), it initially runs fast, but soon starts slowing down and eventually stopping.

Watching its memory usage, p5.serialcontrol starts at 150 MB, but before long has bloated up to > 1GB and eventually crashes.

I suspect a memory leak. LMK if a code example would help.

rdpoor commented 7 years ago

Additional info: running the command line version of the node app:

$ p5serial

... displays identical behavior: bloating and eventually crashing. So maybe I'm reporting this bug in the wrong place?

vanevery commented 7 years ago

Would love a bit more information. What platform and version? Are you running a p5.js sketch or have you just been seeing starting with the p5.serialcontrol and/or commandline version?

A code example would be great.

rdpoor commented 7 years ago

Hi @vanevery:

Mostly complete code example (including "how to replicate") here:

https://gist.github.com/rdpoor/42829a3d224a151172e1b9668468bfbb

As for versions:

vanevery commented 7 years ago

Ok.. Thanks for writing all of that up. I'll try it out in the next day or two.

I have noticed that the chrome console generally can't handle a great deal of data and often ends up crashing or becoming unresponsive. I know you said that the "node" process is the one with the issue but it is it possible that it is chrome?

Can you take out the console.log(serial.readLine()); and just do a serial.readLine(); in your sketch just to make sure?

rdpoor commented 7 years ago

To be clear, I was watching node with ps gux so I know it was node and not Chrome that was bloating (unless Chrome changed its name! ;).

But this is frustrating. Now when I run p5serial, I'm getting:

$ p5serial
p5.serialserver is running
Error: Error: Device not configured, cannot open /dev/cu.usbmodem1411
    at Error (native)

I can't think of any changes I made since testing the gist, and I don't know any other processes that are holding the USB port open. I can run my simple USB monitor to verify that there's data streaming in via USB.

I have to take care of a few other things and will get back to this when I can. Let me know if you have any thoughts in the meantime! Thanks...

vanevery commented 7 years ago

Ok, probably is something with p5.serialserver.. I'll take a look.

rdpoor commented 7 years ago

Nah - it was brain bubble. When I restarted everything, the device is now on cu.usbmodem118. Standby...

rdpoor commented 7 years ago

Got your smoking gun right here (with the console.log() removed):

[~]$ date ; ps guxw 16710
Wed Aug  2 08:03:12 PDT 2017
USER   PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
r    16710   0.0  0.1  3060492  24096 s000  S+    8:02AM   0:00.13 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$ date ; ps guxw 16710
Wed Aug  2 08:03:42 PDT 2017
USER   PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
r    16710 103.3  1.0  3232248 168580 s000  R+    8:02AM   0:18.99 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$ date ; ps guxw 16710
Wed Aug  2 08:04:25 PDT 2017
USER   PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
r    16710 101.0  2.7  3508852 448028 s000  R+    8:02AM   1:03.16 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$ date ; ps guxw 16710
Wed Aug  2 08:05:21 PDT 2017
USER   PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
r    16710 100.4  4.8  3874396 813372 s000  R+    8:02AM   1:59.54 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$ date ; ps guxw 16710
Wed Aug  2 08:05:53 PDT 2017
USER   PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
r    16710 100.0  6.3  4120312 1060592 s000  R+    8:02AM   2:32.73 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
[~]$ 

... after which p5serial unceremoniously died:

$ p5serial
p5.serialserver is running

<--- Last few GCs --->

  268833 ms: Mark-sweep 1346.7 (1396.1) -> 1343.0 (1401.1) MB, 699.6 / 0 ms (+ 51.9 ms in 95 steps since start of marking, biggest step 3.8 ms) [allocation failure] [GC in old space requested].
  269769 ms: Mark-sweep 1343.0 (1401.1) -> 1342.8 (1403.1) MB, 935.8 / 0 ms [allocation failure] [GC in old space requested].
  270697 ms: Mark-sweep 1342.8 (1403.1) -> 1342.7 (1403.1) MB, 928.5 / 0 ms [last resort gc].
  271647 ms: Mark-sweep 1342.7 (1403.1) -> 1342.7 (1403.1) MB, 950.1 / 0 ms [last resort gc].

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x2ead25fc9e59 <JS Object>
    2: send [/Users/r/.nvm/versions/node/v6.3.0/lib/node_modules/p5.serialserver/node_modules/ws/lib/Sender.js:~100] [pc=0x2f08ed0b75f] (this=0x350b3b143ad1 <a Sender with map 0xe238965fee1>,data=0x3eb3eb2c7701 <String[27]: {"method":"data","data":46}>,options=0x3eb3eb2c7789 <an Object with map 0xe2389660569>,cb=0x2ead25f04189 <undefined>)
    3: send [/Users/r/.nvm/versions/node/v6.3.0/lib/nod...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::Abort() [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
 2: node::FatalException(v8::Isolate*, v8::Local<v8::Value>, v8::Local<v8::Message>) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
 3: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
 4: v8::internal::Factory::NewUninitializedFixedArray(int) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
 5: v8::internal::(anonymous namespace)::ElementsAccessorBase<v8::internal::(anonymous namespace)::FastPackedObjectElementsAccessor, v8::internal::(anonymous namespace)::ElementsKindTraits<(v8::internal::ElementsKind)2> >::ConvertElementsWithCapacity(v8::internal::Handle<v8::internal::JSObject>, v8::internal::Handle<v8::internal::FixedArrayBase>, v8::internal::ElementsKind, unsigned int, unsigned int, unsigned int, int) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
 6: v8::internal::(anonymous namespace)::ElementsAccessorBase<v8::internal::(anonymous namespace)::FastPackedObjectElementsAccessor, v8::internal::(anonymous namespace)::ElementsKindTraits<(v8::internal::ElementsKind)2> >::GrowCapacityAndConvertImpl(v8::internal::Handle<v8::internal::JSObject>, unsigned int) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
 7: v8::internal::Runtime_GrowArrayElements(int, v8::internal::Object**, v8::internal::Isolate*) [/Users/r/.nvm/versions/node/v6.3.0/bin/node]
 8: 0x2f08dd06338
Abort trap: 6
vanevery commented 7 years ago

Yup.. I'll dig some this week.

Thanks for the detailed reporting.

rdpoor commented 7 years ago

BTW, I also tried a version where I added a significant delay between serial writes -- my hypothesis was that I'd swamped the node/V8 runtime so it didn't have time to do a GC. However, it still bloated up, albeit much more slowly.

rdpoor commented 7 years ago

Also, after digging around, I thought I'd try to help by running node with the nifty --inspect option:

$ node --inspect `which p5serial`
Debugger listening on port 9229.
Warning: This is an experimental feature and could change at any time.
To start debugging, open the following URL in Chrome:
    chrome-devtools://devtools/remote/serve_file/@521e5b7e2b7cc66b4006a8a54cb9c4e57494a5ef/inspector.html?experiments=true&v8only=true&ws=localhost:9229/node
p5.serialserver is running

I then opened the given URL in Chrome and clicked on "Take Heap Snapshot". It reported "Snapshot 1: Snapshotting" but never completed (even before I started hitting serialserver with massive amounts of data).

Am I missing something about how the Chrome tool works?

rdpoor commented 7 years ago

Shawn:

A helpful netizen suggested I use Chrome Canary to capture the snapshots. Worked great.

So I've captured three Chrome heap snapshots (before running, during running, just about to crash). In the latter two, it's clear that "system/Context" and "(closure)" dominate the bloat, consuming 72% of memory.

I assume that you'd be able to use these (yes)? Since the last one is about a half a gigabyte, I've zipped them together and uploaded them to google docs:

https://drive.google.com/open?id=0B2_wIfsXTC9DaFR0SDlnamhBd2c
vanevery commented 7 years ago

Hi, I can duplicate the memory leak that you are experiencing. I also captured heap snapshots with chrome and can see approximately where the leak is happening which I believe is in the WS module which is being used for the socket communication between the server and clients.

Here is some more information about it: https://github.com/websockets/ws/issues/804

The solution seems to be to add "perMessageDeflate: false" to the WS options.
I am trying that now by changing line 24 in p5.serialserver.js to wss = new WebSocketServer({perMessageDeflate: false, port: SERVER_PORT});

I haven't run it for a long time but so far it seems to be working quite a bit better. Can you also try it?

rdpoor commented 7 years ago

Turning off perMessageDeflate seems to have quashed the major leak. It seems likely that there's some other leak, but it's much slower. Thank you!

Two follow on questions:

But this is good enough for now. I've patched my copy of p5.serialserver.js.

In the ps gux output below, you can see that VSZ climbs slowly (should have timestamped the output), but at least RSZ decreases once in a while -- something it definitely wasn't doing before.

Thanks for your sleuth work and follow-up. Let me know if I can help more.

USER   PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
r    25821   0.0  0.1  3080972  24440 s002  S+    3:38AM   0:00.13 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  64.8  0.2  3132844  31148 s002  R+    3:38AM   0:12.20 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  58.0  0.2  3132844  35832 s002  R+    3:38AM   0:18.08 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  62.0  0.2  3132844  36564 s002  R+    3:38AM   0:27.00 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  62.3  0.2  3132844  36844 s002  S+    3:38AM   0:36.58 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  74.4  0.2  3132844  28724 s002  R+    3:38AM   0:46.93 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  71.2  0.2  3132844  29828 s002  R+    3:38AM   0:55.87 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  49.2  0.2  3132844  39928 s002  R+    3:38AM   1:05.58 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  23.3  0.6  3153168  93904 s002  S+    3:38AM   1:11.16 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  47.3  0.5  3147320  89608 s002  R+    3:38AM   1:20.65 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  45.9  0.5  3147320  89976 s002  R+    3:38AM   1:27.33 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  18.1  0.9  3202764 155456 s002  S+    3:38AM   1:30.67 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  18.1  0.9  3202612 144948 s002  R+    3:38AM   1:34.23 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  48.3  0.8  3193624 136752 s002  R+    3:38AM   1:40.48 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  51.6  0.8  3193624 134948 s002  R+    3:38AM   1:42.83 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  51.1  0.7  3193624 124980 s002  S+    3:38AM   1:50.23 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
r    25821  50.8  0.7  3193624 122920 s002  R+    3:38AM   1:57.92 node /Users/r/.nvm/versions/node/v6.3.0/bin/p5serial
vanevery commented 7 years ago

Does this mean that serialport opens a new socket for every string? (Sorry, I haven't dug into the code that deeply...)

No, one socket is oped per client but the "perMessageDeflate" would be on each message through that socket.

I gather a real fix depends upon nodejs/node#8871, i.e. not soon, correct?

Right but the above fixes that so we should be safe for now.

Memory usage should grow to a certain point and then stabilize.

Thanks for bringing the issue up and helping to solve!

vanevery commented 7 years ago

Reopening because although this is fixed in p5.serialport via this commit: https://github.com/vanevery/p5.serialport/commit/b6b21d648d3400fefda05465978b4c5bdab5b579 p5.serialcontrol still needs an update to incorporate the latest p5.serialport.

vanevery commented 7 years ago

It is now updated so closing this issue