jRichardeau opened this issue 2 years ago
Hi, any news on this? Thanks
Thanks @jRichardeau for opening this issue, and really sorry for the late response.
I wonder if this issue is still relevant - can you please let me know if you are still facing issues after upgrading to the latest `@google-cloud/logging-winston`?
Thank you for your reply @losalex, but after testing again I'm still facing the same issues, even after upgrading `@google-cloud/logging-winston` and `winston` to the latest versions.
NB: You may have to change the `arrivalRate` to 100 in the example above to reproduce.
Thanks @jRichardeau for the response - correct me if I'm wrong, but it seems there was a slight improvement in overall throughput, though still not enough. One more question - it seems you performed the testing on your local Mac, correct? If so, I wonder how powerful it is and whether you have had a chance to test in other environments, such as Google Cloud.
There may be a slight improvement, but do you really think it's a power issue? The problem is not that the buffer fills up faster than it empties; it's that it never empties when there are too many requests, it just keeps growing.
But ok, I will try it on GCP.
Here is my local config:
@jRichardeau, thanks for the answer - yes, I wanted to eliminate any factors related to HW or connectivity, since those usually have the biggest impact on performance. Also correct me if I'm wrong, but the `readable-stream` problem you are raising is not related to log entries upload, but rather to the middleware interaction, correct?
I can't tell you much more about where the issue comes from; I've already spent some time trying to figure out whether the problem is in my code or not, and it seems that it isn't.
The only thing I can say is that "when configuring the transport with the `redirectToStdout` option the problem is not there", so it seems to come from uploading the log entries, but I can't be sure about it.
Does it happen only when using the middleware? Probably yes, but it would have to be tested.
Did you manage to reproduce the problem with the example above?
Hello,
I finally tested it on GCP and I encountered the same performance issue.
So we forced the `redirectToStdout` option on all our services, together with the new `useMessageField: false` option.
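For reference, the workaround looks roughly like this (a minimal sketch; the logger wiring and log level are just illustrative, not our exact service config):

```js
// Sketch of the stdout-based workaround described above (illustrative values).
const winston = require('winston');
const { LoggingWinston } = require('@google-cloud/logging-winston');

const loggingWinston = new LoggingWinston({
  // Write structured log lines to stdout instead of calling the Logging API.
  redirectToStdout: true,
  // Don't wrap the structured payload in a `message` field.
  useMessageField: false,
});

const logger = winston.createLogger({
  level: 'info',
  transports: [loggingWinston],
});

logger.info('hello from the stdout workaround');
```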
We seem to be running into a similar issue with our platform. On GCP we have found that when we use GCP as a sink with either extremely large objects (and/or circular objects), the process eventually crashes with a heap overflow. This occurred more frequently on GCP (App Engine) than locally, but we were eventually able to reproduce it locally by pointing to a GCP App Engine logging destination manually.
```js
const { LoggingWinston } = require('@google-cloud/logging-winston');

new LoggingWinston({
  labels: {
    name: "local",
  },
  projectId: process.env.PROJECT_ID,
  keyFilename: process.env.GCLOUD_APPLICATION_DEFAULT_CREDENTIALS,
  // Called after each write attempt; surfaces transport errors in the console.
  defaultCallback: (err) => {
    if (err) {
      console.error(`GcpLogger Error occurred: ${err}`, err);
    }
  },
})
```
We've gone through and reduced our overall object sizes for logs, but we are not comfortable enabling Winston sending to GCP directly at the moment, so we need to use console as a sink, which has an impact on performance since my understanding is that console.log is fully synchronous (even on App Engine).
Notably:
- We did not have any issues or memory leaks when JUST using Winston sending over Serilog's SEQ data sink.
- We did have `cloudTrace.start()` (`import * as cloudTrace from '@google-cloud/trace-agent';`) running for a while on our servers, but I believe we were running into it even after this was disabled.
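For context, the trace agent was wired up roughly like this (a simplified sketch of a dedicated tracing module loaded first, not our exact bootstrap):

```js
// Hypothetical sketch: enabling the trace agent mentioned above.
// This module is imported at the very top of the server entry point so that
// start() runs before the modules it needs to patch are loaded.
import * as cloudTrace from '@google-cloud/trace-agent';

cloudTrace.start();
```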
Hi @mochsner, are you also using the express middleware? Is this issue still happening for you?
Hi @mochsner, it looks like the issue you are describing here is a little different (wondering if you are using the middleware and have high QPS). Could you open a new issue with a latest repro environment if it is still happening for you?
Hello,
I'm using `@google-cloud/logging-winston` and the `express` middleware for logging purposes.
When using it in production within a service that receives more than 100 requests/second, I see a huge increase in memory usage and latency of the service.
Moreover, when configuring the `transport` with the `redirectToStdout` option, the problem is not there.
I think I've identified what is holding the requests, but I don't know why: it happens in the `readable-stream` module. If we add a `console.log` in the file `/node_modules/readable-stream/lib/_stream_writable.js`, in the `writeOrBuffer` function, like this: `console.log('bufferedRequestCount', state.bufferedRequestCount);`, the value of `state.bufferedRequestCount` is continually growing, holding all the requests in memory. I think the pain threshold is about 50 requests/s; under that it's ok.
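To illustrate the mechanism I mean (a generic Node stream sketch, not the logging-winston internals): when writes outpace the sink and the `write()` return value is ignored, the writable buffer just keeps growing, which matches the ever-increasing `bufferedRequestCount`:

```js
// Generic illustration of unbounded writable buffering (not library code).
const { Writable } = require('stream');

// A sink that "uploads" one entry every 100 ms.
const slowSink = new Writable({
  objectMode: true,
  write(chunk, _enc, cb) {
    setTimeout(cb, 100);
  },
});

let n = 0;
const producer = setInterval(() => {
  // write() returns false when the buffer is full; ignoring it (as a busy
  // request handler effectively does) lets the buffered count grow without bound.
  slowSink.write({ entry: n++ });
  console.log('buffered writes:', slowSink.writableLength);
}, 10);

// Stop the demo after 2 seconds.
setTimeout(() => clearInterval(producer), 2000);
```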
You can reproduce the issue with the example above.
Environment details

- `@google-cloud/logging-winston` version: 4.2.3
- `@google-cloud/trace-agent`: 5.1.6
- `express`: 4.17.3
- `winston`: 3.6.0

Steps to reproduce

- Use the `@google-cloud/logging-winston` middleware for express

Example code of express application
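The original example app isn't reproduced above; a minimal sketch of that kind of setup (assumed wiring and a made-up port, not the author's exact code) would look roughly like:

```js
// Hypothetical repro sketch: express + the logging-winston request middleware.
const express = require('express');
const winston = require('winston');
const lw = require('@google-cloud/logging-winston');

async function main() {
  const logger = winston.createLogger({
    level: 'info',
    transports: [new lw.LoggingWinston()],
  });

  // makeMiddleware attaches a request-scoped logger as req.log and logs each request.
  const middleware = await lw.express.makeMiddleware(logger);

  const app = express();
  app.use(middleware);

  app.get('/', (req, res) => {
    req.log.info('handling request');
    res.send('ok');
  });

  app.listen(8080);
}

main().catch(console.error);
```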
I'm using Artillery to call the endpoint with a certain request rate; this is the `yml` configuration file:

When stopping the calls, the "buffer" is emptied, but in production the calls never end 😄 so the memory and latency keep increasing.
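The `yml` file itself isn't shown above; an Artillery config of the shape described (hypothetical target URL and phase duration, with the `arrivalRate` of 100 mentioned earlier) would look like:

```yml
# Hypothetical Artillery config matching the scenario described in this thread.
config:
  target: "http://localhost:8080"
  phases:
    - duration: 60
      arrivalRate: 100
scenarios:
  - flow:
      - get:
          url: "/"
```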
Thank you for your help