wankdanker / node-odbc

ODBC bindings for node
MIT License

Fixed leaking V8 handles through unnecessary HandleScope.Close() calls #12

Closed Xylem closed 9 years ago

Xylem commented 9 years ago

Calling HandleScope.Close() is unnecessary when its return value is not returned to a calling function. Additionally, it leaks V8 handles, which over time turns into a major memory leak. The issue becomes apparent mainly in long-running, database-intensive applications.

Reproduction

The following code is a minimal program that reproduces the issue. Apart from odbc, it uses async@0.9.0 and heapdump@0.3.5. In my case, I tested against an instance of Postgres running on the same VM.

var heapdump = require('heapdump');
var async = require('async');
var db = require('odbc')();

db.open('DSN=TestDB;SERVER=localhost;UID=postgres;PWD=postgres', function () {
  async.forever(function (done) {
    db.query('select 1', function (err) {
      done(err);
    });
  });
});

After running it, V8 heap dumps can be generated by sending the Node process a signal:

kill -USR2 <PID of node process>
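Alternatively, heapdump also exposes a writeSnapshot() function, so the reproduction above can be extended to write snapshots programmatically. The counter and the snapshot interval below are just an illustration, not something the original test relied on:

var heapdump = require('heapdump');
var async = require('async');
var db = require('odbc')();

var queries = 0; // illustrative counter

db.open('DSN=TestDB;SERVER=localhost;UID=postgres;PWD=postgres', function () {
  async.forever(function (done) {
    db.query('select 1', function (err) {
      // write a snapshot every 100000 queries so growth between files can be compared
      if (++queries % 100000 === 0) {
        heapdump.writeSnapshot('/tmp/odbc-' + queries + '.heapsnapshot');
      }
      done(err);
    });
  });
});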

With the version of odbc without these changes, the size of the heap dumps of the running process should increase by ~3-5MB per minute. When loaded into Chrome Dev Tools, however, the reported heap size for each of the generated files should not exceed ~2MB. Since a heap dump is simply a JSON file, it can be analyzed manually, which I did. What follows is a description of my analysis of the following heap dump: https://drive.google.com/file/d/0B1gxZN-e6g8ZR09FNGk4TVFVaUE/view?usp=sharing

In the nodes array, close to the beginning of the file, you should be able to see a line similar to the following: ,3,9,23,0,3792103. The fields, from left to right, are: type (with 3 being object), name (an index into the strings table, in this case (Handle scope)), id of the node, self_size of the node and, finally, edge_count - the number of edges outgoing from that node.

Further in the file, in the edges array, there should be a significant number of lines similar to ,3,3415,80. The fields, from left to right, are: type (with 3 being internal), name_or_index (again an index into the strings array - contrary to the previous case, the index varies between runs, but it always points to the string undefined_value) and to_node, the id of the node the edge connects to. I don't know the exact format of the heap dump, but I assume that the origin node of each edge is determined by the ordering of nodes in the nodes array and their reported edge_count. In the heap dump I'm describing, there were, in total, 3792107 edges referencing the undefined value, which matches very closely the number of edges outgoing from the (Handle scope) node.
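The counting doesn't have to be done by hand. Below is a rough sketch of a script that tallies the internal edges pointing at undefined_value; it reads the field layout from the snapshot's own meta section, so it should tolerate small differences between V8 versions, though the exact layout of older snapshots may differ from what it assumes:

// count-undefined-edges.js - analysis sketch, not part of the original report
// Usage: node count-undefined-edges.js <file.heapsnapshot>
var fs = require('fs');

var snapshot = JSON.parse(fs.readFileSync(process.argv[2], 'utf8'));
var meta = snapshot.snapshot.meta;
var strings = snapshot.strings;

var edgeFields = meta.edge_fields;   // e.g. ['type', 'name_or_index', 'to_node']
var edgeTypes = meta.edge_types[0];  // names for the numeric edge type field
var typeOffset = edgeFields.indexOf('type');
var nameOffset = edgeFields.indexOf('name_or_index');

var count = 0;
for (var i = 0; i < snapshot.edges.length; i += edgeFields.length) {
  var type = edgeTypes[snapshot.edges[i + typeOffset]];
  var nameOrIndex = snapshot.edges[i + nameOffset];
  // for internal edges, name_or_index is an index into the strings table
  if (type === 'internal' && strings[nameOrIndex] === 'undefined_value') {
    count++;
  }
}

console.log('internal edges referencing undefined_value:', count);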

For comparison, here's a heap dump of the same program after applying my changes: https://drive.google.com/file/d/0B1gxZN-e6g8ZUkNWc1VOUTZ1eFk/view?usp=sharing

Cause

This part is partially speculation, as I didn't do an in-depth analysis of V8's internals, but it seems to match the observed behavior.

The mechanism of HandleScopes is based on a stack. When a new HandleScope is created, it is put on top of the scope stack. When Handle<T> objects (as well as Local<T>) are created, they are owned by the HandleScope currently at the top of the scope stack. Once that HandleScope is removed from the scope stack (e.g. when the function in which it was created as a local variable returns), it takes care of disposing the handles that exist within it. This is why it is unsafe to simply return Handle<T> objects from a function - since they belonged to a HandleScope that was deleted, anything they pointed to may no longer exist. For that reason, the HandleScope.Close() function exists. It takes a handle from the scope being closed and, after removing the scope from the scope stack, creates a new handle to the same object. Since the scope we just closed is no longer on the scope stack, the new handle is created within the HandleScope further down the stack. Here's an example:

Handle<Object> FunctionCalledByJS(...) {
  HandleScope scopeA; // scopeA is put on a scope stack
  Local<Object> objectA; // The Local<Object> handle
                         // is being created within
                         // the scopeA scope

  Local<Object> objectC = InternalFunction(); --> Handle<Object> InternalFunction() {
                                                    HandleScope scopeB; // scopeB is put on a scope stack
                                                    Local<Object> objectB; // Handle is being created
                                                                           // within the scopeB scope
                                                    return scopeB.Close(objectB); // scopeB is removed from the stack
                                                                                  // new Handle<Object> is created in scopeA
                                                                                  // Local<Object> objectB handle is disposed
                                                                                  // newly created handle is returned to function FunctionCalledByJS
                                              <-- }
  return scopeA.Close(objectA); // scopeA is removed from the stack
                                // currently, there is scope of the calling JavaScript code on top of the stack
                                // new Handle<Object> is created in the scope of JS code
                                // Local<Object> objectA handle is disposed
                                // Local<Object> objectC handle is disposed
                                // newly created handle is returned to JS code
}

Now, in the cases that were causing the issue, the following was being done:

void VoidFunction(...) {
  HandleScope scope;

  // code

  scope.Close(Undefined());
}

As you can see, the handle created by HandleScope.Close() is not returned anywhere. This is particularly problematic in all the UV_After<Operation> functions: I assume that, once a UV_<Operation> function exits and libuv calls UV_After<Operation> back on the main application thread, the only scope present on the handle scope stack is the global handle scope (or some other scope that exists throughout the entire lifetime of the process), which causes the handles created by HandleScope.Close() to live until the process terminates.

As for the specifics of the memory leak created by the issue, I'm not entirely sure. Part of it is certainly the bookkeeping for all those handles. I also suspect that V8's garbage collector is simply being crippled by the sheer number of handles it has to track, to the point where it can no longer keep up with freeing memory allocated by other objects that, under normal conditions, wouldn't cause problems.

Solution

I removed all calls to HandleScope.Close() whose return value was unused. I haven't observed any ill effects from these changes.

wankdanker commented 9 years ago

Excellent report. I will test out your changes in our environment and see if we have any side effects or, hopefully, can confirm reduced memory usage in long-running processes. If all goes well I will push out a new release next week.

Thank you again. Great write-up.

Xylem commented 9 years ago

For testing purposes, I recommend running SELECT queries that return a large number of rows. Due to the way ODBCResult::FetchAll() and its counterparts are implemented, after fetching each row, uv_queue_work() queues UV_FetchAll() and UV_AfterFetchAll() (the latter being one of the affected functions) once again to retrieve the next row, until the result set is exhausted - or at least that's what I gather from a brief reading of the code. Such an approach should result in the fastest memory usage growth, as a single call to FetchAll() should create number_of_rows + 1 handles. This coincides with the behavior we observed in our applications: the ones making primarily SELECT queries returning large numbers of rows were affected much more (~200MB of memory usage growth per hour) than those dealing mainly with INSERTs.
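As a concrete sketch of such a test (the generate_series query is just a convenient way to get a large result set out of Postgres, the DSN is the same placeholder as in the reproduction above, and I'm assuming db.query() goes through fetchAll() internally, since its callback receives all rows at once):

// stress-test sketch: repeatedly fetch a large result set so that
// UV_FetchAll()/UV_AfterFetchAll() get queued once per row
var async = require('async');
var db = require('odbc')();

db.open('DSN=TestDB;SERVER=localhost;UID=postgres;PWD=postgres', function () {
  async.forever(function (done) {
    db.query('select * from generate_series(1, 100000)', function (err, rows) {
      done(err);
    });
  });
});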