mediachain / aleph

א: The mediachain universe manipulation engine
MIT License

Fix memory leak in publish command #134

Closed yusefnapora closed 7 years ago

yusefnapora commented 7 years ago

It turns out that console.log retains its arguments indefinitely and prevents them from being GC'd, so you can do fancy things with them like inspect them in dev tools.

This was causing us to eventually run out of heap space when publishing millions of records, since I was using console.log to output the statement ids, etc.

This adds a println utility function that calls process.stdout.write, and replaces all the console.logs with println. A lot of the console.logs wouldn't matter, since they're in one-off commands like mcclient id, but I kind of wanted to burn them all to the ground after chasing this leak all day 😄
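For illustration, here is a minimal sketch of what a helper like this could look like. The variadic signature and the formatLine name are assumptions made for the example, not necessarily what this PR ships; the only part taken from the description above is that println writes through process.stdout.write instead of console.log.

```javascript
// Hypothetical sketch: formatLine and the variadic signature are
// illustrative assumptions, not the actual code from this PR.

// Join the arguments into a single newline-terminated string.
function formatLine (...args) {
  return args.map(String).join(' ') + '\n'
}

// Write the line straight to stdout, bypassing console.log entirely.
function println (...args) {
  process.stdout.write(formatLine(...args))
}

println('published statement', 'example-statement-id')
```

Because the helper hands stdout a plain string, nothing keeps a reference to the original arguments after the call returns.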

There are a few other changes to the publish command:

Something I noticed is that for giant ingestions, the default timeout can be too low; I'll get timeouts on data/put commands if I'm a few million records in. Changing the global timeout works, but we should probably add some backoff / retry logic for things like putting data.
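As a rough idea of what the backoff / retry logic could look like, here is a sketch using exponential backoff with a capped delay. Everything in it is an assumption for discussion: withRetry, backoffDelay, and the option names are hypothetical and do not exist in the codebase, and the operation passed in stands in for whatever function ends up issuing the data/put request.

```javascript
// Hypothetical sketch: withRetry, backoffDelay, and the option names
// are illustrative and not part of the existing code.

// Delay before retry number `attempt` (0-based): doubles each time, capped at maxDelayMs.
function backoffDelay (attempt, baseDelayMs, maxDelayMs) {
  return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt))
}

// Call `operation` (a function returning a promise) and retry on rejection,
// waiting longer between attempts. Rethrows the last error once retries run out.
async function withRetry (operation, options = {}) {
  const { retries = 5, baseDelayMs = 500, maxDelayMs = 30000 } = options
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation()
    } catch (err) {
      if (attempt >= retries) throw err
      const delay = backoffDelay(attempt, baseDelayMs, maxDelayMs)
      await new Promise(resolve => setTimeout(resolve, delay))
    }
  }
}
```

A caller would wrap only the request that tends to time out, e.g. withRetry(() => putData(batch)), where putData is a placeholder name. A real version would probably want to retry only on timeout errors rather than on every rejection.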