Closed ddolcimascolo closed 2 years ago
Hi @trentm,
I'm willing to help diagnose the issue. What can I provide to further track down where the leak come from?
Cheers, David
Side note, I see in the diff between 3.27.0 and 3.28.0 that you supported exit spans for ES (very nice BTW :pray:). Dunno if that's related or not, but we do use extended @elastic/elasticsearch
Client
and Transport
classes.
Client
to register some event handler for logging and plug-in our custom Tansport
class Client extends elasticsearch.Client {
constructor() {
super({
Transport // Our custom Transport class
});
this.registerEventLoggers();
}
registerEventLoggers() {
this.on('request', (err, result) => {
const { connection: { url }, attempts, request: { id, params: { method, path, querystring, body } } } = result.meta;
logger.debug('elasticsearch/%s >> (%d) %s %s%s?%s %s', id, attempts, method, url, path, querystring, body, { err });
});
this.on('response', (err, result) => {
const { statusCode, body, meta: { request: { id } } } = result;
logger.debug('elasticsearch/%s << %d %j', id, statusCode, body, { err });
});
this.on('sniff', (err, result) => {
const { meta: { request: { id }, sniff: { reason, hosts } } } = result;
logger.info('elasticsearch/%s << Sniffing (%s) returned %j', id, reason, hosts, { err });
});
}
}
Transport
to override request(params, options, callback)
to put some request options automatically and also unwrap the response body from the result object. This is what the legacy client was doing and we kept that when we upgraded to @elastic/elasticsearch
client. Fun fact: this is what the client will do by default in v8 :)class Transport extends elasticsearch.Transport {
request(params, options, callback) {
/**
* For search requests we append ?timeout=xxx as a query parameter
* This allows to transparently use a global configuration setting for all requests
*/
if (/_search$/.test(params.path)) {
_.defaultsDeep(params, {
querystring: {
timeout: `${config.ES_REQUEST_TIMEOUT}ms`
}
});
}
const result = super.request(params, options, callback);
if (!callback && !(options && options.asStream)) {
return result.then(({ body }) => body); // Unwraps body from the returned result, to emulate what the old ES client does
}
return result;
}
}
After reverting the lib to v3.27.0
@ddolcimascolo Thanks for reporting! Ugh, sorry about this. Yes, the best guess is probably the changes in the @elastic/elasticsearch
instrumentation in commit 10f533e878f8b04781ab407a0ba9dc04909861a3 (#2557). The immediately preceding change is commit cff14650b3a5af12add139bc401fb2461a9b341d.
"elastic-apm-node": "elastic/apm-agent-nodejs#cff14650b3a5af12add139bc401fb2461a9b341d"
If that does not show the leak, then that almost certainly narrows it down to commit 10f533e8.
The way I've done this in the past is to add this signal handler to my node.js app:
const path = require('path')
const fs = require('fs')
const v8 = require('v8')
console.warn(`"node ${path.basename(process.argv[1])}" is running as PID ${process.pid}: use 'kill -SIGUSR2 ${process.pid}' to create a heapsnapshot` )
process.on('SIGUSR2', function () {
var stamp = new Date().toISOString().replace(/:/g, '-').split('.')[0]
v8.getHeapSnapshot()
.pipe(fs.createWriteStream(`leak-${stamp}.heapsnapshot`))
.on('close', function () {
console.warn(`Created leak-${stamp}.heapsnapshot`)
})
})
kill -SIGUSR2 PID
to tell it to generate the heap snapshot. Note that this'll pause the process while it makes that dump.This will contain internal details about your app, so I understand if you cannot share the file. Or you could share it with me privately and I'd delete it after looking.
I'll get trying to reproduce this shortly.
I'll get trying to reproduce this shortly.
I can repro with a simple express app that queries ES:
var AGENT_DIR = './'
// AGENT_DIR = '/Users/trentm/el/apm-agent-nodejs'
tracer = require(AGENT_DIR).start({
serviceName: 'leak3280play',
// serverUrl: 'http://host.docker.internal:8200', // when in docker
logUncaughtExceptions: true,
metricsInterval:'0s',
breakdownMetrics:false,
centralConfig:false,
cloudProvider:'none',
captureHeaders:false,
})
console.warn(`Using elastic-apm-node@${require(AGENT_DIR+'/package.json').version}`)
const { Client } = require('@elastic/elasticsearch')
const express = require('express')
const app = express()
const port = 3000
const client = new Client({
node: `http://${process.env.ES_HOST || 'localhost'}:9200`
})
app.get('/', (req, res, next) => {
client.search({ q: 'pants' }, function (err, result) {
if (err) {
next(err)
} else {
res.send(result)
}
})
})
app.listen(port, () => {
console.log(`listening at http://localhost:${port}`)
})
I've moved the "latest" release of the agent back to v3.27.0 for now.
Thx for checking @trentm. That was fast! I'll try to set things up on my side to get you a heap dump ASAP.
I can confirm that we're using express too, with some routes that bulk index documents and some others that query ES, in a way similar to your repo above.
David
@ddolcimascolo You can skip getting a heap dump. I understand the issue. I am just working through testing now. I am embarrassed at my mistake that caused this. Thanks for all the details!
elastic-apm-node@3.29.0 has been published with what I understand to be a fix for this.
@trentm I can confirm that v3.29.0 fixes it. Thx again
Describe the bug
High memory usage in v3.28.0, up to a point where the Node processes become unresponsive. Restarting reproduces.
I just reverted to v3.27.0 and will observe charts for a few hours. I'll report here
In the screenshot below (from Kibana APM app itself), elastic-apm-node upgrade was merged on Feb. 9th at 3am (Paris time).This process is automatic. The app was restarted the same day at around 1pm
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Nothing special, APM keeps on collecting data as usual.
Environment (please complete the following information)
How are you starting the agent? (please tick one of the boxes)
agent.start()
directly (e.g.require('elastic-apm-node').start(...)
)elastic-apm-node/start
from within the source code-r elastic-apm-node/start
Additional context
package.json
dependencies: