googleapis / nodejs-pubsub

Node.js client for Google Cloud Pub/Sub: Ingest event streams from anywhere, at any scale, for simple, reliable, real-time stream analytics.
https://cloud.google.com/pubsub/
Apache License 2.0

Total timeout of API google.pubsub.v1.Publisher exceeded 60000 milliseconds before any response was received. #1906

Closed Kripu77 closed 2 days ago

Kripu77 commented 3 months ago

Hi all,

We're encountering the error below:

Total timeout of API google.pubsub.v1.Publisher exceeded 60000 milliseconds before any response was received.

Initially, the service publishes messages to a topic without any errors. However, after the application has been running for a while, typically around 1-2 hours, the errors begin to appear. The Pub/Sub client seems to retain all the messages in memory after the error occurs, leading to a gradual increase in container memory usage.

Environment details

Steps to reproduce

Publisher snippet:

  async publishEvent(event: Event): Promise<void> {
    try {
      const messageId = await this.topic.publishMessage({
        data: Buffer.from(JSON.stringify(event)),
        attributes: {
          eventType: event.eventType,
          eventVersion: event.eventVersion,
        },
      });
    } catch (error) {
      logger.error(
        `Received error while publishing: ${(error as Error).message}`
      );
    }
  }

Subscriber snippet:

export async function setupPubSubSubscription(
  channel: string,
  callback: (message: string) => void
) {
  const subscription = pubsub.subscription(channel);

  subscription.on("message", (message) => {
    try {
      callback(message.data.toString());
      message.ack();
    } catch (err) {
      const ackError = err as AckError;
      logger.info({
        error: `Ack for message ${message.id} failed with error: ${ackError.errorCode}`,
      });
    }
  });
  subscription.on("error", (error: Error) => {
    logger.error({
      type: "Pub-Sub Error",
      message: error.message,
      name: error.name,
      stack: error.stack,
    });
  });
  return subscription;
}

Kripu77 commented 3 months ago

Hello everyone,

For anyone following this issue:

We also use @google-cloud/secret-manager in our services; the version we had before updating was "^4.1.1".

Updating @google-cloud/secret-manager to the latest version available as of today, "^5.3.0", successfully resolved the problem.

This upgrade also pulled in a newer version of grpc-js, which I believe was both the root cause of the issue and the fix for it. Alternatively, the fix may have come from changes to other internal dependencies that secret-manager and nodejs-pubsub use under the hood.
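
For anyone who wants to check which grpc-js versions their own dependency tree resolves to, here is a minimal sketch. It assumes you have the parsed JSON output of `npm ls --all --json` and walks its nested `dependencies` objects. The `NpmLsNode` type and the `collectVersions` helper are hypothetical names for illustration, and the version numbers in the sample tree are made up.

```typescript
// Hypothetical minimal shape of one node in `npm ls --all --json` output.
interface NpmLsNode {
  version?: string;
  dependencies?: Record<string, NpmLsNode>;
}

// Recursively collect every distinct installed version of `target`.
function collectVersions(
  tree: NpmLsNode,
  target: string,
  found: Set<string> = new Set<string>()
): Set<string> {
  for (const [name, node] of Object.entries(tree.dependencies ?? {})) {
    if (name === target && node.version) {
      found.add(node.version);
    }
    collectVersions(node, target, found);
  }
  return found;
}

// Illustrative tree only: the versions below are invented, not taken from the issue.
const sampleTree: NpmLsNode = {
  dependencies: {
    "@google-cloud/pubsub": {
      version: "0.0.0-example",
      dependencies: {
        "@grpc/grpc-js": { version: "1.10.0" },
      },
    },
    "@google-cloud/secret-manager": {
      version: "0.0.0-example",
      dependencies: {
        "@grpc/grpc-js": { version: "1.8.0" },
      },
    },
  },
};

const grpcVersions = collectVersions(sampleTree, "@grpc/grpc-js");
console.log(Array.from(grpcVersions));
```

More than one version in the result would suggest that different Google Cloud clients are each bringing their own copy of grpc-js, which is one way an upgrade of an unrelated package could change transport behaviour.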

feywind commented 2 days ago

Thanks for the issue update. Holding on to more messages as the streams fail is more or less expected, at least until they reach their maximum timeout. The timeout itself sounds transport-related, so a grpc-js update would help there.
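
If the memory growth while publishes are stalled is a concern, one application-side option is to cap the number of in-flight publishes and reject new ones once the cap is hit. The following is only a sketch under assumptions: `PublishFn` and `BoundedPublisher` are hypothetical names, not part of the client library, and the publish function passed in would be something like `(msg) => topic.publishMessage(msg)`. The library's own publisher flow-control settings may be a better fit, so check the client documentation first.

```typescript
// Hypothetical shape of the publish call; resolves with a message ID.
type PublishFn<M> = (message: M) => Promise<string>;

class BoundedPublisher<M> {
  private inFlight = 0;
  private readonly publish: PublishFn<M>;
  private readonly maxInFlight: number;

  // maxInFlight is an assumed limit; tune it per service.
  constructor(publish: PublishFn<M>, maxInFlight: number) {
    this.publish = publish;
    this.maxInFlight = maxInFlight;
  }

  // Number of publishes that have started but not yet settled.
  get pending(): number {
    return this.inFlight;
  }

  // Rejects immediately instead of queueing when the limit is reached,
  // so stalled publishes cannot accumulate without bound.
  async tryPublish(message: M): Promise<string> {
    if (this.inFlight >= this.maxInFlight) {
      throw new Error(
        `Too many in-flight publishes (${this.inFlight}); rejecting message`
      );
    }
    this.inFlight++;
    try {
      return await this.publish(message);
    } finally {
      // Runs on success and on failure (including timeouts from the client).
      this.inFlight--;
    }
  }
}
```

The trade-off is that messages are rejected rather than buffered while the transport is unhealthy, so the caller has to decide whether to drop, retry later, or persist them elsewhere.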