googleapis / nodejs-logging

Node.js client for Stackdriver Logging: Store, search, analyze, monitor, and alert on log data and events from Google Cloud Platform and Amazon Web Services (AWS).
https://cloud.google.com/logging/
Apache License 2.0

Regular error showing Total timeout of API google.logging.v2.LoggingServiceV2 exceeded 600000 milliseconds #1185

Closed yhernesto closed 2 years ago

yhernesto commented 2 years ago

Hi, I'm getting a similar error:

C:\Users\Lenovo\myproject\node_modules\google-gax\build\src\normalCalls\retries.js:66
                const error = new googleError_1.GoogleError(`Total timeout of API ${apiName} exceeded ${retry.backoffSettings.totalTimeoutMillis} milliseconds before any response was received.`);
                              ^

GoogleError: Total timeout of API google.logging.v2.LoggingServiceV2 exceeded 60000 milliseconds before any response was received.
    at repeat (C:\Users\Lenovo\myproject\node_modules\google-gax\build\src\normalCalls\retries.js:66:31)
    at Timeout._onTimeout (C:\Users\Lenovo\myproject\node_modules\google-gax\build\src\normalCalls\retries.js:101:25)
    at listOnTimeout (node:internal/timers:557:17)
    at processTimers (node:internal/timers:500:7) {
  code: 4
}

I'm using: "@google-cloud/language": "^4.2.8", "@google-cloud/logging": "^9.6.2", "@google-cloud/storage": "^5.15.0",

with: "google-gax": "^2.28.1".

I'm running a nestjs app and using Google Logging for NodeJS

Thanks a lot in advance for your help!

Originally posted by @yhernesto in https://github.com/googleapis/nodejs-logging/issues/972#issuecomment-968173656

varungbt commented 2 years ago

We're seeing this error in our logs too from time to time.

yaakovfeldman commented 2 years ago

I get this error too if a request came to my app (in local development) when the internet was disconnected. This is obviously not likely to occur in production, but it is possibly unexpected behaviour for the app to crash whenever the logging library is unable to reach the GCP endpoint? (Especially given the 'fire and forget' approach recommended in the docs.)

This may be correct behaviour, but I assume a lot of users will want to set up some sort of global error catching for this particular error.
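A global handler of the kind described above could be sketched as follows. This is only a minimal sketch, not a mechanism provided by the library: `isLoggingTimeout` is a hypothetical helper, and matching on `code: 4` plus the API name in the message is an assumption based on the stack trace posted in this thread.

```javascript
// gRPC status code 4 is DEADLINE_EXCEEDED, matching `code: 4` in the trace above.
const DEADLINE_EXCEEDED = 4;

// Hypothetical helper: true only for the Cloud Logging timeout reported in this thread.
function isLoggingTimeout(err) {
  return (
    err instanceof Error &&
    err.code === DEADLINE_EXCEEDED &&
    typeof err.message === "string" &&
    err.message.includes("google.logging.v2.LoggingServiceV2")
  );
}

// Swallow only the logging timeout; rethrow everything else so real bugs still surface.
process.on("unhandledRejection", (reason) => {
  if (isLoggingTimeout(reason)) {
    console.error("Dropped log entries: " + reason.message);
    return;
  }
  throw reason;
});
```

Filtering narrowly keeps the default crash-on-unhandled-rejection behaviour for every other error, at the cost of silently losing the log entries that timed out.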

niekcandaele commented 2 years ago

Also seeing this error, using "@google-cloud/logging-winston": "^4.1.1". The app has internet connectivity at all times, but the error still happens.

jyu-hca commented 2 years ago

Hi, I'm also getting this error in a Google Cloud Function.

node v12, "@google-cloud/logging": "^9.2.2", "@google-cloud/logging-winston": "^4.0.4", pub/sub cloud function

The cloud function finished with status 'ok' at 18:40:27, and then an unhandled rejection and timeout error was raised at 18:41:45.


bhr commented 2 years ago

Seeing the same issue on multiple Cloud Run instances.

Using:

    "@google-cloud/error-reporting": "^2.0.4",
    "@google-cloud/logging": "^9.6.3",
    "@google-cloud/logging-winston": "^4.1.1",
    "winston": "^3.3.3"

jdhurst commented 2 years ago

I'm also seeing this issue in a GKE environment:

"pino-stackdriver": "^2.1.1"
"@google-cloud/logging": "^8.0.5",

samuellessa commented 2 years ago

Currently using these versions with node 12.

"@google-cloud/firestore": "^4.15.1", "@google-cloud/logging": "^9.6.7", "@google-cloud/pubsub": "^2.7.0",

Same error

Regular error showing Total timeout of API google.logging.v2.LoggingServiceV2 exceeded 600000 milliseconds

Tried updating and downgrading every lib; nothing seems to work.

Tried installing different versions of google-gax and grpc, and passing grpc when creating new Logging({grpc}).

Found no solution. Is there any way to mute/catch this error?

I tried to find responses on GitHub issues and stackoverflow.

No one has a clear answer of how to resolve this error. 😭😭😭

losalex commented 2 years ago

Thanks @yhernesto for filing this issue and everyone for adding comments! I have some questions, and it would be great if you could answer them. Feel free to comment on each case individually:

  1. Which GCP environment is used?
  2. Does the error happen constantly or in bursts? What pattern do you see on your end?
  3. Any idea whether the error happens when reading or writing?
  4. Is the volume of logs read or written large? If you are reading, is it possible that the query covers a large amount of data?
  5. Are the machines performing the logging requests suffering from CPU or memory issues?

losalex commented 2 years ago

One more thing to mention: if your code is running in a serverless environment (e.g. Cloud Functions or Cloud Run), it is important to remember that a serverless environment only has compute/memory resources for the duration of the request. So if you do something like logger.log('foo'); rather than await logger.log('foo'); you might get timeouts.

There are some alternatives which can be used:

  1. The LogSync class is recommended for use in serverless environments.
  2. Another option for Cloud Functions or Cloud Run is to use JSON.stringify as described here.

There is a good blog post which provides tips for running Node.js in Cloud Functions/Cloud Run here.
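The JSON.stringify option above can be sketched roughly as follows, assuming the code runs on Cloud Functions or Cloud Run, where lines written to stdout are collected by the platform. `formatEntry` and `logJson` are hypothetical helper names and `orderId` is a made-up field; `severity` and `message` are field names that Cloud Logging's structured logging recognises.

```javascript
// Build one line of JSON for the platform's log collector to parse.
// Extra fields are spread first so they cannot override severity or message.
function formatEntry(severity, message, fields = {}) {
  return JSON.stringify({ ...fields, severity, message });
}

// The process issues no Logging API request here, so no API call is left
// pending (and able to time out) after the request handler returns.
function logJson(severity, message, fields) {
  console.log(formatEntry(severity, message, fields));
}

logJson("INFO", "order processed", { orderId: "hypothetical-123" });
```

The trade-off is that delivery is handled by the environment rather than by the client library, so options such as a custom log name are not set from the code.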

losalex commented 2 years ago

I believe there is a possibility to catch an error by using log.write with callback, for example:

    log.write(entry, (err) => {
      if (err) {
        // The log entry was not written.
        console.log(err.message);
      }
    });

Can anyone try this and see if the callback solution helps to avoid crashes?
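A promise-based counterpart to the callback approach might look like the sketch below. It assumes that log.write returns a promise when no callback is passed; `safeWrite` and `failingLog` are hypothetical names used only for illustration, and `failingLog` is a stand-in object, not a real Log instance.

```javascript
// Write an entry and report a failure instead of leaving the rejection unhandled.
// `log` is any object whose write(entry) returns a promise.
async function safeWrite(log, entry) {
  try {
    await log.write(entry);
    return true;
  } catch (err) {
    // The log entry was not written.
    console.error("Log write failed: " + err.message);
    return false;
  }
}

// Hypothetical stand-in for a log whose writes always time out.
const failingLog = {
  write: () => Promise.reject(new Error("Total timeout exceeded")),
};

safeWrite(failingLog, { message: "foo" }).then((ok) => {
  console.log(ok ? "written" : "not written");
});
```

Because the rejection is caught inside `safeWrite`, a failed write resolves to `false` rather than surfacing as an unhandled rejection.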

aarongirard commented 2 years ago

I am also seeing this issue. I am running a nodejs server locally using Winston. I can repro this bug by disconnecting from the internet and logging something. Error below:

Error: Total timeout of API google.logging.v2.LoggingServiceV2 exceeded 60000 milliseconds before any response was received.

If you're using winston, you can resolve this bug by adding support for handling promise rejections. Make sure to also set {...exitOnError: false} in your logger config.

losalex commented 2 years ago

I believe fix 1225 should address the issue. I added a global callback for delete and write calls, which was also suggested by @yaakovfeldman, and listed more possibilities for error handling in the "Error handling with logs written or deleted asynchronously" section of the README. Closing the issue for now; please reactivate or comment if anything else needs attention.

edonosotti commented 2 years ago

Still getting a lot of these in the logs.

The code is running as a Firebase Function in GCP:

I don't have "@google-cloud/logging" as a direct dependency, though. I found 10.1.1 in node_modules, likely a dependency of one of the above.

MilesConn commented 2 years ago

If you're seeing this issue often and you're in a serverless environment like Cloud Run, you should enable writing to stdout so that log writes aren't async. This is documented here.

evil-shrike commented 1 year ago

I'm having the same issue in my cloud function:

Error: Total timeout of API google.logging.v2.LoggingServiceV2 exceeded 60000 milliseconds before any response was received.
    at repeat (/workspace/node_modules/google-gax/build/src/normalCalls/retries.js:66:31)
    at Timeout._onTimeout (/workspace/node_modules/google-gax/build/src/normalCalls/retries.js:101:25)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7)

    "@google-cloud/logging": "^10.4.0",
    "@google-cloud/logging-winston": "^5.3.0",

My code uses a winston logger with LoggingWinston transport from "@google-cloud/logging-winston":

export function createCloudLogger() {
  const cloudLogger = winston.createLogger({
    level: LOG_LEVEL,
    format: format.combine(
      format((info) => {
        info.trace = process.env.TRACE_ID;
        info[LOGGING_TRACE_KEY] = process.env.TRACE_ID;
        return info;
      })(),
    ),
    defaultMeta: getDefaultMetadataForTracing(),
    transports: [
      new LoggingWinston({
        projectId: process.env.GCP_PROJECT,
        labels: {
          component: <string>process.env.LOG_COMPONENT,
        },
        logName: "mylog",
        resource: {
          labels: {
            function_name: <string>process.env.K_SERVICE,
          },
          type: "cloud_function",
        },
        useMessageField: false,
        redirectToStdout: false,
      }),
    ],
  });
  return cloudLogger;
}

I then log using winston's methods such as logger.info and logger.verbose. The methods are sync, so there's no await.