bskiff opened this issue 6 years ago:

The async_hooks Node API provides a way to get your current asynchronous execution context. It is leveraged by cls-hooked to provide storage akin to thread-local storage. I have a use case where I would like to interact with memcached while cls-hooked stores some contextual information. However, this isn't currently possible because memcached (or possibly one of its dependencies) does not leverage the async_hooks embedder API.
We should be able to get this to work without explicitly using the embedder API. Memcached is just using native Node async operations: setTimeout or socket callbacks. I'm a little surprised it doesn't just work. I've seen context get lost, but it's usually around how async_hooks deals with Promises. I can probably get some time to investigate soon. Definitely all for getting this to work with async_hooks.
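For reference, the embedder API in question is async_hooks' AsyncResource. A minimal sketch (not memcached's actual code) of what explicitly using it would look like when handing a callback to a pooled connection:

const { AsyncResource } = require('async_hooks');

// Hypothetical helper: capture the caller's async context now, and replay
// the callback inside it later, even if the socket that eventually invokes
// it was created in an unrelated context.
function bindToCurrentContext(callback) {
  const resource = new AsyncResource('MEMCACHED_REQUEST');
  return (...args) => resource.runInAsyncScope(callback, null, ...args);
}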
I was guessing it might have something to do with the connection pool management in Jackpot, but I don't have any evidence to support that. I can try to get you a simplified example of what I'm seeing if that would help.
To elaborate on the issue I'm seeing, I have a Hapi server where I'm putting contextual information about a request into CLS so that it can be accessed at any point within the lifespan of the request.
When there is a high enough volume of requests, the CLS context from one request starts getting shared across separate requests. At the beginning of each request, I generate a unique request id and place it in my context. However, the same request id will show up in other requests, indicating that the context boundary isn't working properly. For example, when two requests arrive simultaneously:
- the first request arrives at the server; unique request id 'A' is generated and placed into CLS
- the second request arrives at the server; unique request id 'B' is generated and placed into CLS
When each request later attempts to retrieve its request id, both may get 'A' or both may get 'B', even though each CLS context should maintain its own id.
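For concreteness, a minimal sketch of the pattern described above (handleRequest and generateRequestId are hypothetical stand-ins, not my actual Hapi code; memcached is the client from the earlier discussion):

const cls = require('cls-hooked');
const ns = cls.createNamespace('request-context');

function handleRequest(req) {
  ns.run(() => {
    // Store a unique id for this request in the CLS context.
    ns.set('requestId', generateRequestId()); // hypothetical unique-id helper
    memcached.get('foo', () => {
      // With the bug described above, this can log another request's id.
      console.log(ns.get('requestId'));
    });
  });
}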
Here is a stripped-down example of what I think is causing CLS to get mixed up:
const { executionAsyncId } = require('async_hooks');
const Memcached = require('memcached');

const memcached = new Memcached('localhost:11211');

// Fire 20 concurrent gets and log the async context each callback runs in.
for (let i = 0; i < 20; i++) {
  memcached.get('foo', () => {
    console.log(executionAsyncId());
  });
}
When I ran this I saw:
16 20 36 32 28 40 44 12 24 8 8 8 8 8 8 8 8 8 8 8
I don't know what is causing all of those 8s, but I'm pretty sure that's the culprit.
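One way to identify what those 8s are is to log every async resource as it's created; a minimal diagnostic sketch using async_hooks' createHook (fs.writeSync is used instead of console.log because console.log is itself an async operation and would recurse into the hook):

const async_hooks = require('async_hooks');
const fs = require('fs');

// Log every resource creation: its type, its id, and the id of the context
// that triggered it. Matching the repeating id (8 above) against this
// output shows which resource the callbacks are being attributed to.
async_hooks.createHook({
  init(asyncId, type, triggerAsyncId) {
    fs.writeSync(process.stdout.fd, `init ${type} id=${asyncId} trigger=${triggerAsyncId}\n`);
  },
}).enable();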
It may also be helpful to note that some state within memcached appears to get reset after waiting several seconds:
const { executionAsyncId } = require('async_hooks');
const Memcached = require('memcached');

const memcached = new Memcached('localhost:11211');

function executeGet20Times() {
  for (let i = 0; i < 20; i++) {
    memcached.get('foo', () => {
      console.log(executionAsyncId());
    });
  }
}

// Run one batch immediately and another after a 10-second delay.
executeGet20Times();
setTimeout(() => {
  executeGet20Times();
}, 10000);

This printed:
24 8 32 36 40 28 12 16 20 44 44 44 44 44 44 44 44 44 44 44 153 161 157 165 177 173 169 181 185 189 189 189 189 189 189 189 189 189 189 189
After doing a bit more experimentation, I'm now convinced that this is linked to the connection pool. It seems that unique executionAsyncIds are generated until the connection pool is full, after which they start repeating. Adjusting the connection pool size in the example above from the default 10 down to 5:
const { executionAsyncId } = require('async_hooks');
const Memcached = require('memcached');

const memcached = new Memcached('localhost:11211', { poolSize: 5 });

function executeGet20Times() {
  for (let i = 0; i < 20; i++) {
    memcached.get('foo', () => {
      console.log(executionAsyncId());
    });
  }
}

executeGet20Times();
setTimeout(() => {
  executeGet20Times();
}, 10000);
yields:
16 12 8 20 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 127 131 123 135 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139 139
We're having related issues; here's the result of some more in-depth tracing of what's going on when you get the repeating-id loop:
https://gist.github.com/dbushong/2954e9cd3fc3d58449a96529817eab03
OK, I have a theory. Here's some code which doesn't use memcached or jackpot: https://gist.github.com/dbushong/330210786fab2f453ce840569f45fd43 and here's the output:
'TCPWRAP' { id: 5, pid: 1 }
'TCPWRAP' { id: 8, pid: 1 }
'TCPWRAP' { id: 10, pid: 1 }
'TCPWRAP' { id: 12, pid: 1 }
'TCPWRAP' { id: 14, pid: 1 }
'TCPCONNECTWRAP' { id: 16, pid: 8 }
'TCPCONNECTWRAP' { id: 17, pid: 5 }
'connect' { j: 1, id: 16 }
'TCPCONNECTWRAP' { id: 20, pid: 10 }
'TCPCONNECTWRAP' { id: 21, pid: 14 }
'TCPCONNECTWRAP' { id: 22, pid: 12 }
'connect' { j: 0, id: 17 }
'data' { j: 1, id: 8 }
'connect' { j: 2, id: 20 }
'connect' { j: 4, id: 21 }
'connect' { j: 3, id: 22 }
'data' { j: 0, id: 5 }
'data' { j: 2, id: 10 }
'data' { j: 4, id: 14 }
'data' { j: 3, id: 12 }
To sum up: first you get a context for TCPWRAP, then inside that you get TCPCONNECTWRAP, but then your actual data-fetch callbacks run inside the TCPWRAP context again.
My suspicion is this: with connection reuse you end up back in the original TCP connection's context as each result comes in on the kept-alive connection, thus losing your desired async context. I'll try to come up with a simpler example of this, but I think this might be a fundamental issue around persistent connections + async_hooks.
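The core pattern can be sketched without memcached or jackpot at all (a hypothetical illustration assuming a local memcached on port 11211; any long-lived socket would do): every 'data' event fires in the socket's own context, regardless of which logical request triggered it.

const net = require('net');
const { executionAsyncId } = require('async_hooks');

// One persistent connection, several logical requests over it.
const socket = net.connect(11211, 'localhost');

socket.on('connect', () => {
  for (let i = 0; i < 5; i++) {
    socket.write('get foo\r\n'); // memcached text protocol
  }
});

socket.on('data', () => {
  // Every response callback runs in the socket's context; the contexts of
  // the individual requests that were in flight are not recoverable here.
  console.log(executionAsyncId());
});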