googleapis / google-cloud-node

Google Cloud Client Library for Node.js
https://cloud.google.com/nodejs
Apache License 2.0

Error: Endpoint read failed at /user_code/node_modules/@google-cloud/logging/node_modules/grpc/src/node/src/client.js:569:15 #2438

Closed deremer closed 5 years ago

deremer commented 7 years ago

We are using Stackdriver to log errors for Firebase functions. We unpredictably get the error in the title. Typically it happens the first time we try to log an error after a function re-deploys. Subsequent error log writes go through to Stackdriver without issue, but occasionally we get the error again.

We're using "@google-cloud/logging": "^1.0.2", and deployed via Firebase Functions.

Below is our module that implements the logging...

Anybody have any idea what is causing this?

const Logging = require('@google-cloud/logging');

// To keep on top of errors, we should raise a verbose error report with Stackdriver rather
// than simply relying on console.error. This will calculate users affected + send you email
// alerts, if you've opted into receiving them.
const logging = Logging();

// This is the name of the StackDriver log stream that will receive the log
// entry. This name can be any valid log stream name, but must contain "err"
// in order for the error to be picked up by StackDriver Error Reporting.
const logName:string = 'errors-fb-func';

// Enum of StackDriver severities
enum Severities {
  ERROR = 500, // ERROR (500) Error events are likely to cause problems.
  CRITICAL = 600, // CRITICAL   (600) Critical events cause more severe problems or outages.
  ALERT = 700, // ALERT (700) A person must take an action immediately.
  EMERGENCY = 800 // EMERGENCY  (800) One or more systems are unusable.
}

// Provide an error object and, optionally, a severity level and a user
export function log(err:Error, logLevel:number=Severities.ERROR, user?:string): Promise<any> {
  // https://cloud.google.com/functions/docs/monitoring/error-reporting#advanced_error_reporting

  const FUNCTION_NAME = process.env.FUNCTION_NAME;
  const log = logging.log(logName);

  const metadata = {
    // MonitoredResource
    // See https://cloud.google.com/logging/docs/api/ref_v2beta1/rest/v2beta1/MonitoredResource
    resource: {
      // MonitoredResource.type
      type: 'cloud_function',
      // MonitoredResource.labels
      labels: {
       function_name: FUNCTION_NAME
      }
    },
    severity: logLevel
  };

  const context:any = {};
  if (user && typeof user === 'string') {
    // ErrorEvent.context.user
    context.user = user;
  }

  // ErrorEvent
  // See https://cloud.google.com/error-reporting/reference/rest/v1beta1/ErrorEvent
  let structPayload:any = {
    // ErrorEvent.serviceContext
    serviceContext: {
      // ErrorEvent.serviceContext.service
      service: `cloud_function:${FUNCTION_NAME}`,
      // ErrorEvent.serviceContext.resourceType
      resourceType: 'cloud_function'
    },

  };

  if (context) {
    // ErrorEvent.context
    structPayload.context = context;
  }

  structPayload.message = getMsgForError(err);

  return writeLog(log, metadata, structPayload);
}

function getMsgForError(error:Error): string {
  // https://cloud.google.com/functions/docs/monitoring/error-reporting#advanced_error_reporting
  // ErrorEvent.message
  if (error instanceof Error && typeof error.stack === 'string') {
    return error.stack;
  } else if (typeof error === 'string') {
    return error;
  } else if (typeof error.message === 'string') {
    return error.message;
  } else {
    logFatalError(error, "Error message type not supported");
    return "";
  }
}

function writeLog(log:any, metadata:any, structPayload:any): Promise<any> {
  console.log(metadata);
  console.log(structPayload);
  // Write the error log entry
  return new Promise((resolve, reject) => {
    try {
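      log.write(log.entry(metadata, structPayload), (error:any) => {
        if (error) {
          // Log and reject. Throwing here (as logFatalError does) would escape
          // the promise, so the caller could never handle the failure.
          console.error(error);
          reject(error);
          return;
        }
        resolve();
      });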
    } catch(error) {
      reject(error);
    }
  });
}

// Utility function to log error if Logger fails
function logFatalError(error:Error, msg?:string): void {
  console.error(error, msg);
  throw error;
}

// error, critical, alert, emergency accept an Error object
// And then set error.stack as the message
export function error(error:Error, user?:string): Promise<any> {
  return log(error, Severities.ERROR, user);
}

export function critical(error:Error, user?:string): Promise<any> {
  return log(error, Severities.CRITICAL, user);
}

export function alert(error:Error, user?:string): Promise<any> {
  return log(error, Severities.ALERT, user);
}

export function emergency(error:Error, user?:string): Promise<any> {
  return log(error, Severities.EMERGENCY, user);
}

stephenplusplus commented 6 years ago

We've just released these modules which include the upgrade to gRPC 1.9. This should correct all problems reported in this issue.

@google-cloud/datastore v1.3.5, @google-cloud/logging v1.1.7, @google-cloud/pubsub v1.6.5, @google-cloud/spanner v1.2.0

gcooney commented 6 years ago

Was seeing this issue using @google-cloud/monitoring v0.4.1 to write some metrics via a cloud function. Tried upgrading to the not yet released 0.5.0 that upgrades to grpc v1.9.1 and I'm seeing a similar but not identical error message.

Error: 14 UNAVAILABLE: TCP Read failed
at createStatusError (/user_code/node_modules/@google-cloud/monitoring/node_modules/grpc/src/client.js:64)
at (/user_code/node_modules/@google-cloud/monitoring/node_modules/grpc/src/client.js:583)

Is this a different manifestation of the same issue or should I open a new ticket?

murgatroid99 commented 6 years ago

That is a closely related error. As mentioned, it is now getting reported with the UNAVAILABLE status code. I thought the Google Cloud library would retry requests that fail with that status code.
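
The retry behaviour mentioned above can also be approximated in application code. Below is a minimal sketch, assuming the failing call returns a promise and rejects with an error whose `code` is 14, as in the traces in this thread. `retryOnUnavailable`, `maxAttempts`, and `baseDelayMs` are hypothetical names chosen for illustration; they are not part of any Google Cloud library.

```typescript
// gRPC status code 14 is UNAVAILABLE, as shown in the errors in this thread.
const UNAVAILABLE = 14;

// Hypothetical helper: retries `call` when it rejects with code 14.
// `maxAttempts` and `baseDelayMs` are illustrative defaults, not library settings.
async function retryOnUnavailable<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const code = (err as { code?: number } | null)?.code;
      // Give up on any other error, or once the attempts are used up.
      if (code !== UNAVAILABLE || attempt >= maxAttempts) {
        throw err;
      }
      // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
      const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A client call could then be passed in as `() => someClientCall()`. Retrying is only safe when the wrapped operation is idempotent, so this is a stopgap rather than a fix for the underlying connection problem.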

designxtek commented 6 years ago

I'm seeing

{ Error: 14 UNAVAILABLE: TCP Read failed at new createStatusError (/user_code/node_modules/@google-cloud/datastore/node_modules/grpc/src/client.js:64:15) at /user_code/node_modules/@google-cloud/datastore/node_modules/grpc/src/client.js:583:15 code: 14, metadata: Metadata { _internal_repr: {} }, details: 'TCP Read failed' }

When trying to save() on datastore via GCF.

WaldoJeffers commented 6 years ago

@stephenplusplus Sadly, I'm still seeing this issue. As many people mentioned, it only happens on the first request. I'm using @google-cloud/spanner v1.4.0.

The full stack trace is:

A  0|app      | Error: Cannot find module '/app/node_modules/@google-cloud/spanner/node_modules/@google-cloud/common-grpc/node_modules/grpc/src/node/src/client.js'

A  0|app      |     at Function.Module._resolveFilename (module.js:542:15)
A  0|app      |     at Module._load (module.js:472:25)
A  0|app      |     at loadAndPatch (/app/node_modules/@google-cloud/trace-agent/build/src/trace-plugin-loader.js:141:36)
A  0|app      |     at Function.Module_load [as _load] (/app/node_modules/@google-cloud/trace-agent/build/src/trace-plugin-loader.js:173:37)
A  0|app      |     at Module.require (module.js:585:17)
A  0|app      |     at require (internal/module.js:11:18)
A  0|app      |     at Object.<anonymous> (/app/node_modules/@google-cloud/spanner/node_modules/@google-cloud/common-grpc/src/service.js:26:12)
A  0|app      |     at Module._compile (module.js:641:30)
A  0|app      |     at Object.Module._extensions..js (module.js:652:10)
A  0|app      |     at Module.load (module.js:560:32)

ofrobots commented 6 years ago

@WaldoJeffers based on the error message, that seems to be an unrelated error. Can you open a new issue for this?

kylehotchkiss commented 6 years ago

Same issue with the logging-bunyan package run from Firebase Functions.

try-catch block doesn't appear to catch these either, so if it errors, the function quits.
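
One general reason a try-catch can miss a failure like this is that the error surfaces in an asynchronous callback, after the surrounding `try` block has already exited. The thread does not confirm that this is what happens inside logging-bunyan, so the following is only a sketch of the general pattern. `failingWrite`, `toPromise`, and `safeWrite` are hypothetical names, and `failingWrite` is a stand-in for a real client call.

```typescript
// Node-style callback signature, as used by log.write in the module above.
type Callback = (error?: Error | null) => void;

// Hypothetical stand-in for a client call that fails asynchronously.
function failingWrite(callback: Callback): void {
  setTimeout(() => callback(new Error('14 UNAVAILABLE: TCP Read failed')), 0);
}

// Wraps a callback-style call so that failures become promise rejections.
function toPromise(write: (callback: Callback) => void): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    write((error) => {
      if (error) {
        // Reject rather than throw: a throw here would happen after the
        // caller's try block has already exited, so nothing would catch it.
        reject(error);
        return;
      }
      resolve();
    });
  });
}

// Returns true when the write succeeded, false when the failure was caught.
async function safeWrite(): Promise<boolean> {
  try {
    // Awaiting the promise is what lets the catch block see the rejection.
    await toPromise(failingWrite);
    return true;
  } catch (error) {
    console.error('log write failed, continuing:', error);
    return false;
  }
}
```

With this shape, a failed log write is reported and swallowed instead of terminating the function. Whether swallowing is acceptable depends on how important the lost log entry is.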

stephenhandley commented 6 years ago

Also seeing an issue using logging-bunyan within Firebase Functions

Error: 14 UNAVAILABLE: TCP Read failed 
at Object.exports.createStatusError (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/common.js:87:15) 
at Object.onReceiveStatus (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/client_interceptors.js:1188:28)
at InterceptingListener._callNext (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/client_interceptors.js:564:42) 
at InterceptingListener.onReceiveStatus (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/client_interceptors.js:614:8) 
at callback (/user_code/node_modules/@google-cloud/logging-bunyan/node_modules/grpc/src/client_interceptors.js:841:24)

victor5114 commented 6 years ago

Hello. I'm using @google-cloud/datastore (v1.4.1) to write entities through a stream. I have no issue running the code locally, but I get an error when I run it on a cloud function.

Here's my log trace :

 insertId:  "000000-080c5482-2dd4-4ee6-9227-3f15ae868137"  
 labels: {…}  
 logName:  "xxx"
 receiveTimestamp:  "2018-08-16T16:42:27.680546524Z"  
 resource: {…}  
 severity:  "INFO"  
 textPayload:  "{ Error: 14 UNAVAILABLE: TCP Read failed
    at Object.exports.createStatusError (/srv/node_modules/grpc/src/common.js:87:15)
    at Object.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:1188:28)
    at InterceptingListener._callNext (/srv/node_modules/grpc/src/client_interceptors.js:564:42)
    at InterceptingListener.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:614:8)
    at callback (/srv/node_modules/grpc/src/client_interceptors.js:841:24)
  code: 14,
  metadata: Metadata { _internal_repr: {} },
  details: 'TCP Read failed' }"  
 timestamp:  "2018-08-16T16:42:20.211Z"  
}

The only difference I can see between the two environments' setups is that, when I run locally, I pass the credential object that the Google libraries require to connect to Google services (e.g. projectId, keyFilename). Nothing is passed in the cloud function environment, since auth is supposed to be implicit across GCP services. (I verified this using the @google-cloud/storage lib both locally and in the function.)

JustinBeckwith commented 5 years ago

Well, oh bother. I can reliably reproduce this. I have an app engine standard nodejs app that seems to always toss a 500 on my first request after using it for the first time in a while. The output in cloud logging looks like this:

A  { Error: 14 UNAVAILABLE: TCP Read failed 
A      at Object.exports.createStatusError (/srv/node_modules/grpc/src/common.js:91:15) 
A      at Object.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:1204:28) 
A      at InterceptingListener._callNext (/srv/node_modules/grpc/src/client_interceptors.js:568:42) 
A      at InterceptingListener.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:618:8) 
A      at callback (/srv/node_modules/grpc/src/client_interceptors.js:845:24) 
A    code: 14, 
A    metadata: Metadata { _internal_repr: {} }, 
A    details: 'TCP Read failed' } 

This is happening when trying to do a simple read from firestore:

const db = new Firestore();
const jobs = db.collection('jobs');
await jobs.doc(job.id).set(job)

After doing a refresh and making the second call, all appears to be working fine. This does not happen locally, or in a docker container.

@stephenplusplus @alexander-fenster @murgatroid99 @ofrobots

I can give folks access to my project if this is useful. I am going to work on an independent repo. FWIW, the app is: https://npmtrace.appspot.com

You have to search for a package name/version that hasn't been indexed. You can find the sources here: https://github.com/JustinBeckwith/npmtrace

jasonpolites commented 5 years ago

Smells a little like an internal networking issue (TCP protocol support surface), which may be unrelated to GRPC or client libraries specifically. @JustinBeckwith perhaps create an internal bug and reference this issue for context?

sergioregueira commented 5 years ago

Same error here using @google-cloud/logging: 4.1.1 on GAE NodeJS8 Standard Environment.

Error: 14 UNAVAILABLE: TCP Read failed 
  at Object.exports.createStatusError (/srv/node_modules/grpc/src/common.js:87:15) 
  at Object.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:1188:28) 
  at InterceptingListener._callNext (/srv/node_modules/grpc/src/client_interceptors.js:564:42) 
  at InterceptingListener.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:614:8) 
  at callback (/srv/node_modules/grpc/src/client_interceptors.js:841:24) 

First errors detected in three different projects in the first week of December.

choipd commented 5 years ago

I've got the same error message in Firebase Functions's log:

{ Error: 14 UNAVAILABLE: TCP Read failed
    at Object.exports.createStatusError (/user_code/node_modules/firebase-admin/node_modules/grpc/src/common.js:87:15)
    at Object.onReceiveStatus (/user_code/node_modules/firebase-admin/node_modules/grpc/src/client_interceptors.js:1188:28)
    at InterceptingListener._callNext (/user_code/node_modules/firebase-admin/node_modules/grpc/src/client_interceptors.js:564:42)
    at InterceptingListener.onReceiveStatus (/user_code/node_modules/firebase-admin/node_modules/grpc/src/client_interceptors.js:614:8)
    at callback (/user_code/node_modules/firebase-admin/node_modules/grpc/src/client_interceptors.js:841:24)
  code: 14,
  metadata: Metadata { _internal_repr: {} },
  details: 'TCP Read failed' }

kyle-mccarthy commented 5 years ago

I am getting this error as well, in the GAE standard Node 10 environment. I am using @google-cloud/logging-winston, which is using @google-cloud/logging@4.0.1.

Error: 14 UNAVAILABLE: TCP Read failed
    at Object.exports.createStatusError (/srv/node_modules/grpc/src/common.js:87:15)
    at Object.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:1188:28)
    at InterceptingListener._callNext (/srv/node_modules/grpc/src/client_interceptors.js:564:42)
    at InterceptingListener.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:614:8)
    at callback (/srv/node_modules/grpc/src/client_interceptors.js:841:24)

jigneshMitel commented 5 years ago

I'm also facing the same issue: after some idle time, grpc throws an UNAVAILABLE error. Is there a way to work around this problem? I didn't find any solution in this thread.

stephenplusplus commented 5 years ago

@JustinBeckwith RE: https://github.com/googleapis/google-cloud-node/issues/2438#issuecomment-450755179 -- have you filed an issue or found anything?

JustinBeckwith commented 5 years ago

Yeah, there's an internal bug on cloud functions here. They're doing some things to increase a timeout, but this issue absolutely still exists.

rhodgkins commented 5 years ago

@JustinBeckwith does that also apply to GAE NodeJS8 Standard Environment as well?

JustinBeckwith commented 5 years ago

Possibly. Are you seeing a similar callstack in an app engine standard app? cc @steren

rhodgkins commented 5 years ago

I can't remember/find the stack trace I'm afraid (I starred this issue after this comment https://github.com/googleapis/google-cloud-node/issues/2438#issuecomment-454587844) :(

I originally came here because of googleapis/nodejs-logging-winston#190, and just reverted back to a previous version of @google-cloud/logging-winston.

kyle-mccarthy commented 5 years ago

@JustinBeckwith fwiw I am having this issue in GAE Standard with Node 10. Do you know if there is an open issue within Google or on their issue tracker? I am using nodejs-logging-winston and see the TCP Read failed error; at the same time, my redis connection to a GCE instance also has problems and gets an ECONNRESET.

- 2019-03-06 14:38:04.758 CST    [ioredis] Error: read ECONNRESET ... at TCP.onStreamRead (internal/stream_base_commons.js:111:27)
- 2019-03-06 14:43:27.776 CST    Winston encountered an error - Error: 14 UNAVAILABLE: TCP Read failed

Is it an issue in how internal communication is handled?

deremer commented 5 years ago

Not sure if this is unrelated, since the original issue was related to logging

Recently, we've been getting these more in our Node 8 cloud functions. We've isolated it down to a part of the code that writes to Firestore.

It may be coincidental, but there seems to be a pattern. We tend to get this error in batches (i.e., many of them around the same time, and then it stops). It also tends to happen around the time our app's load spikes. I have a hypothesis that during traffic bursts there may be some issue setting up resources as instances are scaled? Just a wild guess...

Error: 14 UNAVAILABLE: TCP Read failed
at Object.exports.createStatusError (common.js:87)
at Object.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:1188)
at InterceptingListener._callNext (/srv/node_modules/grpc/src/client_interceptors.js:564)
at InterceptingListener.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:614)
at callback (/srv/node_modules/grpc/src/client_interceptors.js:841)

JustinBeckwith commented 5 years ago

Greetings folks! I think we've resolved this issue. With the latest and greatest version of all libraries, you should now be getting a dependency on @grpc/grpc-js instead of grpc. This module is rewritten from the ground up, and uses the native HTTP/2 support in node core. We think the combination of this new module, and some fixes to timeouts on the Cloud Functions/App Engine backend should resolve the issue.

If you are running into this still ... please let us know! Just make sure you're using the latest version of the module, and that you post a stack trace (as above).
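
Before posting a trace, it can help to check which gRPC implementation it actually comes from, since the traces in this thread all point into `node_modules/grpc/`. The sketch below guesses the implementation from the file paths in a stack trace. `grpcImplementationFromStack` is a hypothetical helper, and it assumes the rewritten module is installed under `node_modules/@grpc/grpc-js/`, which is where npm places a package of that name.

```typescript
type GrpcImplementation = 'grpc' | '@grpc/grpc-js' | 'unknown';

// Hypothetical helper: guesses the gRPC implementation from the file paths
// that appear in a stack trace string.
function grpcImplementationFromStack(stack: string): GrpcImplementation {
  // Assumed install path of the pure-JavaScript rewrite.
  if (stack.indexOf('node_modules/@grpc/grpc-js/') !== -1) {
    return '@grpc/grpc-js';
  }
  // The original native module shows up as node_modules/grpc/ in this thread's traces.
  if (stack.indexOf('node_modules/grpc/') !== -1) {
    return 'grpc';
  }
  return 'unknown';
}
```

A result of `'grpc'` suggests the deployed dependencies still resolve to the older module, in which case updating the client library is the first thing to try.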

Edit: I was wrong

juanparadox commented 5 years ago

Any workarounds for this issue?

JustinBeckwith commented 5 years ago

Not currently, but we're making progress on a fix

alexander-fenster commented 5 years ago

Hi folks,

We made some changes in @grpc/grpc-js (https://github.com/grpc/grpc-node/pull/1021) that might fix the problem described here. They are released as v0.5.3. Could I bother you to update, check whether the issue still occurs with v0.5.3, and send a stack trace back if something still does not work?

Thanks!

bcoe commented 5 years ago

@deremer 👋 hey, I just wanted to check in and see if @alexander-fenster's changes in @grpc/grpc-js helped?

deremer commented 5 years ago

> @deremer 👋 hey, I just wanted to check in and see if @alexander-fenster's changes in @grpc/grpc-js helped?

We haven’t had a release to deploy with updated dependencies yet. I will report back once we do so and get some data. Lots of other folks on here too so they might see results sooner.

bcoe commented 5 years ago

@deremer @juanparadox @kyle-mccarthy, can you folks confirm that the latest release of @google-cloud/logging has addressed this issue for you?

bcoe commented 5 years ago

@deremer I'm going to go ahead and close this issue, because it's been almost a month, and seems like things have been working okay for folks.

Please feel free to re-open if you are continuing to bump into these issues.

felixmpaulus commented 4 years ago

This error popped up this morning for the first time.

{ Error: 14 UNAVAILABLE: TCP Read failed
    at Object.exports.createStatusError (/srv/node_modules/grpc/src/common.js:91:15)
    at Object.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:1204:28)
    at InterceptingListener._callNext (/srv/node_modules/grpc/src/client_interceptors.js:568:42)
    at InterceptingListener.onReceiveStatus (/srv/node_modules/grpc/src/client_interceptors.js:618:8)
    at callback (/srv/node_modules/grpc/src/client_interceptors.js:845:24)
  code: 14,
  metadata: Metadata { _internal_repr: {} },
  details: 'TCP Read failed' }

JustinBeckwith commented 4 years ago

Greetings! Could you open a new issue with details?