bskiff opened this issue 6 years ago:

The async_hooks Node API provides a way to get your current asynchronous execution context. It is leveraged by cls-hooked to provide storage akin to thread-local storage. I have a use case where I would like to interact with memcached while cls-hooked stores some contextual information. However, this isn't currently possible because memcached (or possibly one of its dependencies) does not leverage the async_hooks embedder API.
We should be able to get this to work without explicitly using the embedder API. Memcached is just using native Node async operations: setTimeout or socket callbacks. I'm a little surprised it doesn't just work. I've seen context get lost, but it's usually around how async_hooks deals with Promises. I can probably get some time to investigate soon. Definitely all for getting this to work with async_hooks.
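For reference, the embedder API in question is async_hooks' AsyncResource. A minimal sketch (not memcached's actual code) of what explicitly using it would look like when handing a callback to a pooled connection:

const { AsyncResource } = require('async_hooks');

// Hypothetical helper: capture the caller's async context now, and replay
// the callback inside it later, even if the socket that eventually invokes
// it was created in an unrelated context.
function bindToCurrentContext(callback) {
  const resource = new AsyncResource('MEMCACHED_REQUEST');
  return (...args) => resource.runInAsyncScope(callback, null, ...args);
}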
I was guessing it might have something to do with the connection pool management in Jackpot, but I don't have any evidence to support that. I can try to get you a simplified example of what I'm seeing if that would help.
To elaborate on the issue I'm seeing, I have a Hapi server where I'm putting contextual information about a request into CLS so that it can be accessed at any point within the lifespan of the request.
When there is a high enough volume of requests, the CLS context from one request starts getting shared across separate requests. At the beginning of each request, I generate a unique request id and place it in my context. However, the same request id will show up in other requests, indicating that the context boundary isn't working properly. For example, when two requests arrive simultaneously:
- the first request arrives at the server; unique request id 'A' is generated and placed into CLS
- the second request arrives at the server; unique request id 'B' is generated and placed into CLS
When each request later attempts to retrieve its request id, both may get 'A' or both may get 'B', even though each CLS context should maintain its own id.
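For concreteness, a minimal sketch of the pattern described above (handleRequest and generateRequestId are hypothetical stand-ins, not my actual Hapi code; memcached is the client from the earlier discussion):

const cls = require('cls-hooked');
const ns = cls.createNamespace('request-context');

function handleRequest(req) {
  ns.run(() => {
    // Store a unique id for this request in the CLS context.
    ns.set('requestId', generateRequestId()); // hypothetical unique-id helper
    memcached.get('foo', () => {
      // With the bug described above, this can log another request's id.
      console.log(ns.get('requestId'));
    });
  });
}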
Here is a stripped-down example of what I think is causing CLS to get mixed up:
const { executionAsyncId } = require('async_hooks');
const Memcached = require('memcached');

const memcached = new Memcached('localhost:11211');

// Fire 20 concurrent gets and log the async context each callback runs in.
for (let i = 0; i < 20; i++) {
  memcached.get('foo', () => {
    console.log(executionAsyncId());
  });
}
When I ran this I saw:
16 20 36 32 28 40 44 12 24 8 8 8 8 8 8 8 8 8 8 8
I don't know what is causing all of those 8s, but I'm pretty sure that's the culprit.
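One way to identify what those 8s are is to log every async resource as it's created; a minimal diagnostic sketch using async_hooks' createHook (fs.writeSync is used instead of console.log because console.log is itself an async operation and would recurse into the hook):

const async_hooks = require('async_hooks');
const fs = require('fs');

// Log every resource creation: its type, its id, and the id of the context
// that triggered it. Matching the repeating id (8 above) against this
// output shows which resource the callbacks are being attributed to.
async_hooks.createHook({
  init(asyncId, type, triggerAsyncId) {
    fs.writeSync(process.stdout.fd, `init ${type} id=${asyncId} trigger=${triggerAsyncId}\n`);
  },
}).enable();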
It may also be helpful to note that some state within memcached appears to get reset after waiting several seconds:
const { executionAsyncId } = require('async_hooks');
const Memcached = require('memcached');

const memcached = new Memcached('localhost:11211');

function executeGet20Times() {
  for (let i = 0; i < 20; i++) {
    memcached.get('foo', () => {
      console.log(executionAsyncId());
    });
  }
}

// Run one batch immediately and another after a 10-second delay.
executeGet20Times();
setTimeout(() => {
  executeGet20Times();
}, 10000);

This printed:
24 8 32 36 40 28 12 16 20 44 44 44 44 44 44 44 44 44 44 44 153 161 157 165 177 173 169 181 185 189 189 189 189 189 189 189 189 189 189 189
After doing a bit more experimentation, I'm now convinced that this is linked to the connection pool. It seems that unique executionAsyncIds are generated until the connection pool is full, after which they start repeating. Adjusting the connection pool size in the example above from the default 10 down to 5:
const { executionAsyncId } = require('async_hooks');
const Memcached = require('memcached');

const memcached = new Memcached('localhost:11211', { poolSize: 5 });

function executeGet20Times() {
  for (let i = 0; i < 20; i++) {
    memcached.get('foo', () => {
      console.log(executionAsyncId());
    });
  }
}

executeGet20Times();
setTimeout(() => {
  executeGet20Times();
}, 10000);
yields:
16 12 8 20 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 127 131 123 135 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139
We're having related issues; here's the result of some more in-depth tracing of what's going on when you get the repeating-id loop:
https://gist.github.com/dbushong/2954e9cd3fc3d58449a96529817eab03
OK, I have a theory. Here's some code which doesn't use memcached or jackpot: https://gist.github.com/dbushong/330210786fab2f453ce840569f45fd43 and here's the output:
'TCPWRAP' { id: 5, pid: 1 }
'TCPWRAP' { id: 8, pid: 1 }
'TCPWRAP' { id: 10, pid: 1 }
'TCPWRAP' { id: 12, pid: 1 }
'TCPWRAP' { id: 14, pid: 1 }
'TCPCONNECTWRAP' { id: 16, pid: 8 }
'TCPCONNECTWRAP' { id: 17, pid: 5 }
'connect' { j: 1, id: 16 }
'TCPCONNECTWRAP' { id: 20, pid: 10 }
'TCPCONNECTWRAP' { id: 21, pid: 14 }
'TCPCONNECTWRAP' { id: 22, pid: 12 }
'connect' { j: 0, id: 17 }
'data' { j: 1, id: 8 }
'connect' { j: 2, id: 20 }
'connect' { j: 4, id: 21 }
'connect' { j: 3, id: 22 }
'data' { j: 0, id: 5 }
'data' { j: 2, id: 10 }
'data' { j: 4, id: 14 }
'data' { j: 3, id: 12 }
To sum up: first you get a context for TCPWRAP, then inside that you get TCPCONNECTWRAP, but then your actual data-fetch callbacks run inside the TCPWRAP context again.
My suspicion is this: with connection reuse you end up back in the original TCP connection's context as each result comes in on the kept-alive connection, thus losing your desired async context. I'll try to come up with a simpler example of this, but I think this might be a fundamental issue around persistent connections + async_hooks.
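The core pattern can be sketched without memcached or jackpot at all (a hypothetical illustration assuming a local memcached on port 11211; any long-lived socket would do): every 'data' event fires in the socket's own context, regardless of which logical request triggered it.

const net = require('net');
const { executionAsyncId } = require('async_hooks');

// One persistent connection, several logical requests over it.
const socket = net.connect(11211, 'localhost');

socket.on('connect', () => {
  for (let i = 0; i < 5; i++) {
    socket.write('get foo\r\n'); // memcached text protocol
  }
});

socket.on('data', () => {
  // Every response callback runs in the socket's context; the contexts of
  // the individual requests that were in flight are not recoverable here.
  console.log(executionAsyncId());
});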