grpc / grpc-node

gRPC for Node.js
https://grpc.io
Apache License 2.0
4.48k stars 649 forks source link

Tracking issue for grpc-js invalid metadata errors #1173

Closed murgatroid99 closed 10 months ago

murgatroid99 commented 5 years ago

This is a tracking issue for the following error message output by @grpc/grpc-js:

Failed to add metadata entry <key>: <values>. <Extra error information>. For more information see https://github.com/grpc/grpc-node/issues/1173.

This is not a fatal error. Most applications should work as normal after seeing this. This error message indicates that a key/value pair has been omitted from the response metadata object as a result of some kind of data corruption.

This problem was originally reported at googleapis/nodejs-datastore#415. So far, all detailed reports have a timestamp as the value in the error message.

If you see this message with timestamp metadata values when making a request to a Google server from a Google Cloud server using a Google Cloud API client library or Firebase, please add a +1 reaction to this comment to help us track how frequently it happens

If you received this message using a different library or from accessing a different server, please share the following information to help us find the root cause of the error:

In addition, if you can get such a thing it would be very helpful to have a tcpdump of an unencrypted TCP session that triggers this error.

npomfret commented 5 years ago

Using:

$ npm ls | grep grpc-js
│ │ ├─┬ @grpc/grpc-js@0.6.9

I don't know the library exactly, I have the following:

$ npm ls | grep google
├─┬ @google-cloud/firestore@2.6.0
│ ├─┬ google-gax@1.7.5
│ │ ├── google-auth-library@5.5.1 deduped
├─┬ @google-cloud/logging@5.5.5
│ ├─┬ @google-cloud/common@2.2.3
│ │ ├── @google-cloud/projectify@1.0.1 deduped
│ │ ├── @google-cloud/promisify@1.0.2 deduped
│ │ ├── google-auth-library@5.5.1 deduped
│ ├─┬ @google-cloud/paginator@2.0.1
│ ├── @google-cloud/projectify@1.0.1
│ ├── @google-cloud/promisify@1.0.2
│ ├─┬ google-auth-library@5.5.1
│ │ │ ├─┬ google-p12-pem@2.0.2
│ ├── google-gax@1.7.5 deduped
├─┬ @google-cloud/logging-bunyan@2.0.0
│ ├─┬ @google-cloud/logging@5.5.4
│ │ ├── @google-cloud/common@2.2.3 deduped
│ │ ├── @google-cloud/paginator@2.0.1 deduped
│ │ ├── @google-cloud/projectify@1.0.1 deduped
│ │ ├── @google-cloud/promisify@1.0.2 deduped
│ │ ├── google-auth-library@5.5.1 deduped
│ │ ├── google-gax@1.7.5 deduped
│ └── google-auth-library@5.5.1 deduped
├─┬ @google-cloud/pubsub@1.1.5
│ ├── @google-cloud/paginator@2.0.1 deduped
│ ├── @google-cloud/precise-date@1.0.1
│ ├── @google-cloud/projectify@1.0.1 deduped
│ ├── @google-cloud/promisify@1.0.2 deduped
│ ├── google-auth-library@5.5.1 deduped
│ ├── google-gax@1.7.5 deduped
│ ├─┬ @google-cloud/firestore@2.5.0
│ │ ├── google-gax@1.7.5 deduped
│ ├─┬ @google-cloud/storage@3.5.0
│ │ ├── @google-cloud/common@2.2.3 deduped
│ │ ├── @google-cloud/paginator@2.0.1 deduped
│ │ ├── @google-cloud/promisify@1.0.2 deduped
│ │ │ ├── google-auth-library@5.5.1 deduped

Full error (from stackdriver console):

Error: Failed to add metadata entry @���: Wed, 13 Nov 2019 07:13:16 GMT. Metadata key "@���" contains illegal characters at validate (/home/nodeapp/apps/crypto-trader-prototype/shared-code/node_modules/@grpc/grpc-js/build/src/metadata.js:35:15) at Metadata.add (/home/nodeapp/apps/crypto-trader-prototype/shared-code/node_modules/@grpc/grpc-js/build/src/metadata.js:87:9) at values.split.forEach.v (/home/nodeapp/apps/crypto-trader-prototype/shared-code/node_modules/@grpc/grpc-js/build/src/metadata.js:226:63) at Array.forEach (<anonymous>) at Object.keys.forEach.key (/home/nodeapp/apps/crypto-trader-prototype/shared-code/node_modules/@grpc/grpc-js/build/src/metadata.js:226:43) at Array.forEach (<anonymous>) at Function.fromHttp2Headers (/home/nodeapp/apps/crypto-trader-prototype/shared-code/node_modules/@grpc/grpc-js/build/src/metadata.js:200:30) at ClientHttp2Stream.stream.on (/home/nodeapp/apps/crypto-trader-prototype/shared-code/node_modules/@grpc/grpc-js/build/src/call-stream.js:211:56) at ClientHttp2Stream.emit (events.js:193:13) at emit (internal/http2/core.js:244:8) at processTicksAndRejections (internal/process/task_queues.js:81:17)

Related to https://github.com/googleapis/nodejs-pubsub/issues/818 maybe.

murgatroid99 commented 5 years ago

Thank you for the report. That looks consistent with the other reports we have seen: like the others, the key is 4 apparently random characters and the value is a formatted timestamp.

sempi commented 4 years ago

We are experiencing the same issue.

Failed to add metadata entry €��: Sat, 07 Dec 2019 14:56:38 GMT. Metadata key "€��" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173

Is there anything we can do to get the fix for this accelerated?

murgatroid99 commented 4 years ago

As mentioned in the original issue, please say what library you are using, and if possible what the domain name is of the server you are accessing.

Other than that, the most helpful thing you or anyone else could provide is a full tcpdump or similar of an unencrypted TCP session that triggers this error.

sempi commented 4 years ago

We are using @google-cloud/firestore@2.5.0 and grpc@1.24.2.

The issue is related to using firestore APIs.

It does not say which server it is connected when the error happens.

node_modules/@google-cloud/firestore/node_modules/@grpc/grpc-js/build/src/metadata.js

Line 31

        const result = new Metadata();
        Object.keys(headers).forEach(key => {
            // Reserved headers (beginning with `:`) are not valid keys.
            if (key.charAt(0) === ':') {
                return;
            }
            const values = headers[key];
            try {
                if (isBinaryKey(key)) {
                    if (Array.isArray(values)) {
                        values.forEach(value => {
                            result.add(key, Buffer.from(value, 'base64'));
                        });
                    }
                    else if (values !== undefined) {
                        values.split(',').forEach(v => {
                            result.add(key, Buffer.from(v.trim(), 'base64'));
                        });
                    }
                }
                else {
                    if (Array.isArray(values)) {
                        values.forEach(value => {
                            result.add(key, value);
                        });
                    }
                    else if (values !== undefined) {
                        values.split(',').forEach(v => result.add(key, v.trim()));
                    }
                }
            }
            catch (error) {
                error.message = `Failed to add metadata entry ${key}: ${values}. ${error.message}`;
                process.emitWarning(error);
            }
        });
        return result;
    }

We are performing several million operations per day and this happens frequently across independent firestore instances. So it's very likely that this is a software issue present on all your servers or in the client library. Here is a sample across two regions with originating region (where the request is made). There is two firestore instances. One in us-central1 and one in europe-west1.

image

sempi commented 4 years ago

Also the error is just logged to the console. The consuming firestore layer does not properly subscribe to this error and propagate it. It therefor fails silently.

We also re-read and check the data we wrote. So it's possible that this is an issue that does not affect the data being written or read since we have not detected any corruption on the data and we check every record by reading it back after writing it.

sempi commented 4 years ago

Also within firestore it uses @grpc/grpc-js@0.6.9

So I'd assume it actually uses that implementation of grpc not the grpc package we use otherwise.

sempi commented 4 years ago

We won't be able to give you a tcpdump due to the volume of data and frequency of occurrence and other operational considerations. But if you have a beta version of firestore that has more info when the error happens then we can give that a try in our staging environment.

murgatroid99 commented 4 years ago

Thank you for the extra details. From that log image it looks like you are seeing this error probably 20-30 times out of the several million requests you make per day. Would you say that is accurate?

I want to note that so far we believe this corruption only impacts headers, and we have only seen reports so far of it impacting the date header. So you should not worry about your data being corrupted by this bug.

Overall, your report is consistent with the other reports of this bug that we have seen, so unfortunately that does not help us significantly with figuring out what is wrong.

jcurlier commented 4 years ago

Same issue on several nodejs applications on GKE clusters

image

@grpc/grpc-js "version": "0.6.14"

I am using Google Pub-Sub ("@google-cloud/pubsub": "1.1.6") and Winston for logging on all the container having the issue.

If you have a debug version of the @grpc/grpc-js library that you'd like me to run to gather more information, I would be happy to do.

786 jeromecurlier:Ingestor (master)$ npm ls | grep google
├─┬ @google-cloud/bigquery@4.5.0
│ ├─┬ @google-cloud/common@2.2.5
│ │ ├── @google-cloud/projectify@1.0.4 deduped
│ │ ├── @google-cloud/promisify@1.0.4 deduped
│ │ ├── google-auth-library@5.7.0 deduped
│ ├─┬ @google-cloud/paginator@2.0.3
│ ├── @google-cloud/promisify@1.0.4
├─┬ @google-cloud/pubsub@1.1.6
│ ├── @google-cloud/paginator@2.0.3 deduped
│ ├── @google-cloud/precise-date@1.0.3
│ ├── @google-cloud/projectify@1.0.4
│ ├── @google-cloud/promisify@1.0.4 deduped
│ ├─┬ google-auth-library@5.7.0
│ │ │ ├─┬ google-p12-pem@2.0.3
│ ├─┬ google-gax@1.12.0
│ │ ├── google-auth-library@5.7.0 deduped
jakeleventhal commented 4 years ago

Same issue showing up here https://github.com/googleapis/nodejs-firestore/issues/829

yatingoswami commented 4 years ago

same issue happening over several million messages

`│ ├─┬ @grpc/grpc-js@0.5.2
│ │ ├─┬ @grpc/grpc-js@0.6.15`
[](`Failed to add metadata entry @: Thu, 02 Jan 2020 19:16:00 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Thu, 02 Jan 2020 20:27:14 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Thu, 02 Jan 2020 20:35:07 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Fri, 03 Jan 2020 03:57:36 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Fri, 03 Jan 2020 04:08:07 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Fri, 03 Jan 2020 07:44:08 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Fri, 03 Jan 2020 07:57:17 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Fri, 03 Jan 2020 17:37:07 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Sat, 04 Jan 2020 07:37:37 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Sat, 04 Jan 2020 22:47:31 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Sun, 05 Jan 2020 10:57:13 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry @: Sun, 05 Jan 2020 17:23:52 GMT. Metadata key "@" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173` )
murgatroid99 commented 4 years ago

@yatingoswami I just want to make sure I understand: is that the total number of times you observed that warning in several million requests?

kbariotis commented 4 years ago

We are using @google-cloud/pubsub@1.2 and the error we are getting is:

"Failed to add metadata entry A: Tue, 28 Jan 2020 18:17:14 GMT. Metadata key "a" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173

"Failed to add metadata entry ¾æÔ: Tue, 28 Jan 2020 20:06:29 GMT. Metadata key "¾æô" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173

arvinsingla commented 4 years ago

We're using "@google-cloud/pubsub": "1.1.6"

Failed to add metadata entry A: Tue, 11 Feb 2020 14:22:24 GMT. Metadata key "a" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173

Failed to add metadata entry ~ÞÏ: Tue, 11 Feb 2020 23:21:07 GMT. Metadata key "~þï" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
genisd commented 4 years ago

using "@google-cloud/pubsub@1.2.0": here

Failed to add metadata entry A: Wed, 12 Feb 2020 20:34:43 GMT. Metadata key "a" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
Failed to add metadata entry àFp9: Wed, 12 Feb 2020 20:40:59 GMT. Metadata key "àfp9" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173

together with (not sure which if any of these are relevant) google-gax-1.14.1 google-auth-library-5.9.2 @grpc/grpc-js-0.6.15 @grpc/proto-loader-0.5.3

rodneycjones commented 4 years ago

I think this started happening after we started using Firestore batch updates (NodeJSv10 on CloudRun) "@grpc/grpc-js": {"version": "0.6.9"

2020-02-26 12:57:30.772 PST
(node:14) Error: Failed to add metadata entry @���: Wed, 26 Feb 2020 20:57:30 GMT. Metadata key "@���" contains illegal characters
2020-02-26 10:09:39.183 PST
(node:14) Error: Failed to add metadata entry @���: Wed, 26 Feb 2020 18:09:39 GMT. Metadata key "@���" contains illegal characters
2020-02-26 08:00:20.433 PST
(node:14) Error: Failed to add metadata entry @���: Wed, 26 Feb 2020 16:00:20 GMT. Metadata key "@���" contains illegal characters
2020-02-26 07:03:17.985 PST
(node:14) Error: Failed to add metadata entry @���: Wed, 26 Feb 2020 15:03:17 GMT. Metadata key "@���" contains illegal characters
2020-02-26 06:37:15.100 PST
(node:14) Error: Failed to add metadata entry @���: Wed, 26 Feb 2020 14:37:15 GMT. Metadata key "@���" contains illegal characters
2020-02-25 13:59:37.114 PST
(node:14) Error: Failed to add metadata entry @���: Tue, 25 Feb 2020 21:59:37 GMT. Metadata key "@���" contains illegal characters
2020-02-25 12:17:12.270 PST
(node:14) Error: Failed to add metadata entry @���: Tue, 25 Feb 2020 20:17:12 GMT. Metadata key "@���" contains illegal characters
2020-02-25 09:46:35.209 PST
(node:14) Error: Failed to add metadata entry @���: Tue, 25 Feb 2020 17:46:35 GMT. Metadata key "@���" contains illegal characters
2020-02-25 08:58:09.110 PST
(node:14) Error: Failed to add metadata entry @���: Tue, 25 Feb 2020 16:58:09 GMT. Metadata key "@���" contains illegal characters
2020-02-25 06:09:58.931 PST
(node:14) Error: Failed to add metadata entry @���: Tue, 25 Feb 2020 14:09:58 GMT. Metadata key "@���" contains illegal characters
2020-02-24 14:13:59.617 PST
(node:14) Error: Failed to add metadata entry @���: Mon, 24 Feb 2020 22:13:59 GMT. Metadata key "@���" contains illegal characters
2020-02-24 14:13:38.182 PST
(node:14) Error: Failed to add metadata entry @���: Mon, 24 Feb 2020 22:13:38 GMT. Metadata key "@���" contains illegal characters
2020-02-24 14:13:30.502 PST
(node:14) Error: Failed to add metadata entry @���: Mon, 24 Feb 2020 22:13:30 GMT. Metadata key "@���" contains illegal characters
2020-02-24 10:59:55.654 PST
(node:14) Error: Failed to add metadata entry @���: Mon, 24 Feb 2020 18:59:55 GMT. Metadata key "@���" contains illegal characters
2020-02-24 08:14:54.309 PST
(node:14) Error: Failed to add metadata entry @���: Mon, 24 Feb 2020 16:14:54 GMT. Metadata key "@���" contains illegal characters
2020-02-21 10:52:25.382 PST
(node:14) Error: Failed to add metadata entry @���: Fri, 21 Feb 2020 18:52:25 GMT. Metadata key "@���" contains illegal characters
2020-02-21 05:35:27.945 PST
(node:14) Error: Failed to add metadata entry @���: Fri, 21 Feb 2020 13:35:27 GMT. Metadata key "@���" contains illegal characters
2020-02-21 04:59:22.798 PST
(node:14) Error: Failed to add metadata entry @���: Fri, 21 Feb 2020 12:59:22 GMT. Metadata key "@���" contains illegal characters
2020-02-20 06:08:29.630 PST
(node:14) Error: Failed to add metadata entry @���: Thu, 20 Feb 2020 14:08:29 GMT. Metadata key "@���" contains illegal characters
2020-02-20 02:37:58.947 PST
(node:14) Error: Failed to add metadata entry @���: Thu, 20 Feb 2020 10:37:58 GMT. Metadata key "@���" contains illegal characters
npm ls | grep google
├── @google-cloud/connect-datastore@3.1.0
├─┬ @google-cloud/datastore@4.4.0
│ ├── @google-cloud/projectify@1.0.1
│ ├── @google-cloud/promisify@1.0.2
│ ├── google-auth-library@5.5.0 deduped
│ ├─┬ google-gax@1.7.5
│ │ ├── google-auth-library@5.5.0 deduped
├─┬ @google-cloud/debug-agent@4.0.4
│ ├─┬ @google-cloud/common@2.2.2
│ │ ├── @google-cloud/projectify@1.0.1 deduped
│ │ ├── @google-cloud/promisify@1.0.2 deduped
│ │ ├── google-auth-library@5.5.0 deduped
├─┬ @google-cloud/logging-winston@2.1.0
│ ├─┬ @google-cloud/logging@5.4.1
│ │ ├─┬ @google-cloud/common-grpc@1.0.7
│ │ │ ├── @google-cloud/common@2.2.2 deduped
│ │ │ ├── @google-cloud/projectify@1.0.1 deduped
│ │ │ ├── @google-cloud/promisify@1.0.2 deduped
│ │ ├── @google-cloud/paginator@2.0.1 deduped
│ │ ├── @google-cloud/projectify@1.0.1 deduped
│ │ ├── @google-cloud/promisify@1.0.2 deduped
│ │ ├── google-auth-library@5.5.0 deduped
│ │ ├── google-gax@1.7.5 deduped
│ ├── google-auth-library@5.5.0 deduped
├─┬ @google-cloud/monitoring@1.4.0
│ └── google-gax@1.7.5 deduped
├─┬ @google-cloud/pubsub@1.1.5
│ ├─┬ @google-cloud/paginator@2.0.1
│ ├── @google-cloud/precise-date@1.0.1
│ ├── @google-cloud/projectify@1.0.1 deduped
│ ├── @google-cloud/promisify@1.0.2 deduped
│ ├── google-auth-library@5.5.0 deduped
│ ├── google-gax@1.7.5 deduped
├─┬ @google-cloud/storage@3.5.0
│ ├── @google-cloud/common@2.2.2 deduped
│ ├── @google-cloud/paginator@2.0.1 deduped
│ ├── @google-cloud/promisify@1.0.2 deduped
│ │ ├── google-auth-library@5.5.0 deduped
├─┬ @google-cloud/tasks@1.5.0
│ └── google-gax@1.7.5 deduped
├─┬ @google-cloud/trace-agent@4.2.2
│ ├── @google-cloud/common@2.2.2 deduped
│ ├─┬ @google-cloud/firestore@2.4.0
│ │ ├── google-gax@1.7.5 deduped
│ ├── @google-cloud/storage@3.5.0 deduped
├─┬ google-auth-library@5.5.0
│ │ ├─┬ google-p12-pem@2.0.2
├─┬ googleapis@42.0.0
│ ├── google-auth-library@5.5.0 deduped
│ └─┬ googleapis-common@3.1.1
│   ├── google-auth-library@5.5.0 deduped
├─┬ passport-google-oauth20@2.0.0
icco commented 4 years ago
Failed to add metadata entry A: Sat, 25 Apr 2020 04:18:09 GMT. Metadata key "a" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
$ npm ls | grep google

├─┬ @google-cloud/logging-bunyan@2.0.3
│ ├─┬ @google-cloud/logging@7.3.0
│ │ ├─┬ @google-cloud/common@2.4.0
│ │ │ ├── @google-cloud/projectify@1.0.4 deduped
│ │ │ ├── @google-cloud/promisify@1.0.4 deduped
│ │ │ ├── google-auth-library@5.10.1 deduped
│ │ ├─┬ @google-cloud/paginator@2.0.3
│ │ ├── @google-cloud/projectify@1.0.4
│ │ ├── @google-cloud/promisify@1.0.4
│ │ ├── google-auth-library@5.10.1 deduped
│ │ ├─┬ google-gax@1.15.2
│ │ │ ├── google-auth-library@5.10.1 deduped
│ └─┬ google-auth-library@5.10.1
│   │ ├─┬ google-p12-pem@2.0.4

code is at github.com/icco/validator

IgorOrsag commented 4 years ago

We perform cca 2-3 K operations daily

Failed to add metadata entry A: Mon, 04 May 2020 09:46:10 GMT. Metadata key "a" contains illegal characters.
Screenshot 2020-05-04 at 11 51 13
yarn list | grep google
├─ @google-cloud/datastore@5.1.0
│  ├─ @google-cloud/projectify@^1.0.4
│  ├─ @google-cloud/promisify@^1.0.4
│  ├─ google-auth-library@^5.10.1
│  ├─ google-gax@^1.14.2
├─ @google-cloud/projectify@1.0.4
├─ @google-cloud/promisify@1.0.4
│  ├─ google-libphonenumber@^3.1.6
├─ google-auth-library@5.10.1
├─ google-gax@1.15.2
│  ├─ google-auth-library@^5.0.0
├─ google-libphonenumber@3.2.8
├─ google-p12-pem@2.0.4
│  ├─ google-p12-pem@^2.0.0
yarn list | grep grpc
│  ├─ @grpc/grpc-js@0.7.0
├─ @grpc/grpc-js@0.7.0
├─ @grpc/proto-loader@0.5.4
│  ├─ @grpc/grpc-js@^0.7.4
│  ├─ @grpc/grpc-js@0.7.9
│  ├─ @grpc/proto-loader@^0.5.1
murgatroid99 commented 4 years ago

To anyone who has already commented here, can you please comment if you encountered this message while running your code somewhere other than a Google Cloud Platform server (GCE, GKE, Cloud Run, Cloud Functions, Firebase Functions, etc.)?

nateq314 commented 4 years ago

Encountered in local dev here. Accessing Firestore via Firebase Admin SDK as part of a data migration:

Failed to add metadata entry A: Wed, 06 May 2020 12:35:02 GMT. Metadata key "a" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
murgatroid99 commented 4 years ago

@nateq314 Thank you.

Crispy1975 commented 3 years ago
14:14 $ npm list | grep google
│ ├─┬ @google-cloud/bigquery@4.7.0
│ │ ├─┬ @google-cloud/common@2.4.0
│ │ │ ├── @google-cloud/projectify@1.0.4
│ │ │ ├── @google-cloud/promisify@1.0.4 deduped
│ │ │ ├─┬ google-auth-library@5.10.1
│ │ │ │ │ ├─┬ google-p12-pem@2.0.4
│ │ ├─┬ @google-cloud/paginator@2.0.3
│ │ ├── @google-cloud/promisify@1.0.4
│ ├─┬ @google-cloud/spanner@5.4.0
│ │ ├─┬ @google-cloud/common@3.5.0
│ │ │ ├── @google-cloud/projectify@2.0.1 deduped
│ │ │ ├── @google-cloud/promisify@2.0.3 deduped
│ │ │ ├── google-auth-library@6.1.3 deduped
│ │ ├── @google-cloud/precise-date@2.0.3
│ │ ├── @google-cloud/projectify@2.0.1
│ │ ├── @google-cloud/promisify@2.0.3
│ │ ├─┬ google-auth-library@6.1.3
│ │ │ │ ├─┬ google-p12-pem@3.0.3
│ │ ├─┬ google-gax@2.9.2
│ │ │ │ ├─┬ google-auth-library@6.1.3
│ │ │ │ │ │ ├─┬ google-p12-pem@3.0.3
│ │ │ ├─┬ google-auth-library@6.1.3
│ │ │ │ │ ├─┬ google-p12-pem@3.0.3

We have a library that connects to both Spanner and BigQuery, however we see the following metadata error when making requests to Cloud Spanner. All of our environments are GKE based in different projects and regions. This error at the moment is specific to one region and GKE cluster.

Failed to add metadata entry A : Deadlock with higher priority transaction.. Metadata key "a " contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173

The relevant gRPC library versions follow:

npm list | grep grpc
│ │ │ ├─┬ @grpc/grpc-js@1.1.8
│ │ │ │ ├─┬ @grpc/proto-loader@0.6.0-pre9
│ │ │ ├─┬ @grpc/proto-loader@0.5.5
│ │ ├─┬ grpc-gcp@0.3.3
│ │ │ └── @grpc/grpc-js@1.1.8 deduped

Eventually our process goes OOM as follows:

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
XXXXXXXXXXXXXXXXXXXX <--- Last few GCs --->
XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXX [17:0x55c26b98eee0]   873517 ms: Scavenge 510.2 (512.7) -> 509.5 (513.2) MB, 7.1 / 0.0 ms  (average mu = 0.142, current mu = 0.032) allocation failure
XXXXXXXXXXXXXXXXXXXX [17:0x55c26b98eee0]   875020 ms: Mark-sweep 510.3 (513.2) -> 509.6 (513.2) MB, 1310.8 / 0.0 ms  (+ 140.9 ms in 26 steps since start of marking, biggest step 73.2 ms, walltime since start of marking 1695 ms) (average mu = 0.240, current mu = 0.336) allocat
XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXX <--- JS stacktrace --->
XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXX ==== JS stack trace =========================================
XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXX Security context: 0x35b08fc008d1 <JSObject>
XXXXXXXXXXXXXXXXXXXX     0: builtin exit frame: new Error(aka Error)(this=0x0c7129440591 <the_hole>,0x33a74b8d4cb9 <String[54]: 10 ABORTED: Deadlock with higher priority transaction.>,0x0c7129440591 <the_hole>)
XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXX     1: ConstructFrame [pc: 0x55c268d0b65a]
XXXXXXXXXXXXXXXXXXXX     2: onReceiveStatus [0x1aa7c5a59d19] [/src/node_modules/@grpc/grpc-js/build/src/client.js:~167]

As mentioned this is one environment out of 6.

murgatroid99 commented 3 years ago

@Crispy1975 Thank you for the report. How frequently are you seeing that metadata error? And would you mind saying which GKE region is experiencing those errors?

Crispy1975 commented 3 years ago

@murgatroid99 apologies for the delayed response. We are seeing the messages every 20 minutes in one environment and less often in another, both on the GCP region us-west1. The Spanner clusters that are being communicated with are in europe-west4. All sorts of interesting characters...

E 2021-01-14T11:44:20.257951991Z Failed to add metadata entry A�������!���: Deadlock with higher priority transaction.. Metadata key "a�������!���" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173

E 2021-01-14T13:02:01.637458223Z Failed to add metadata entry A������� ��: 10. Metadata key "a������� ��" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173

Edit: I just double-checked some of our other envs and these also have the same errors, albeit less frequent. Other regions are: europe-west3, europe-west4, us-east4.

felipegcampos commented 3 years ago

We encountered this problem while moving from AWS NLB to AWS ALB with grpc support and migrating from grpc-node C++ to grpc-js in the server.

We're using @grpc/grpc-js@1.3.0

warning Resolution field "handlebars@4.5.3" is incompatible with requested version "handlebars@3.0.0"                                                                                       
├─ @grpc/grpc-js@1.3.0
├─ @grpc/proto-loader@0.5.2

I enabled the trace and logs in the server and for us, it seems to be failing the request every time it happens. Here's an example of a request from web service to data service.

May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.015Z | call_stream | Starting stream on subchannel 10.39.50.239:30051 with headers
May 04 17:17:25 web-service-b2e2a5ccd02d        grpc-accept-encoding: identity,deflate,gzip
May 04 17:17:25 web-service-b2e2a5ccd02d        accept-encoding: identity
May 04 17:17:25 web-service-b2e2a5ccd02d        :authority: core-25154-internal.staging.resolver.com:30051
May 04 17:17:25 web-service-b2e2a5ccd02d        user-agent: grpc-node-js/1.3.0
May 04 17:17:25 web-service-b2e2a5ccd02d        content-type: application/grpc
May 04 17:17:25 web-service-b2e2a5ccd02d        :method: POST
May 04 17:17:25 web-service-b2e2a5ccd02d        :path: /DataService.RolePermissionsController/add
May 04 17:17:25 web-service-b2e2a5ccd02d        te: trailers
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.015Z | call_stream | [5309] attachHttp2Stream from subchannel 10.39.50.239:30051
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.015Z | subchannel_refcount | 10.39.50.239:30051 callRefcount 0 -> 1
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.015Z | call_stream | [5309] sending data chunk of length 594 (deferred)
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.016Z | call_stream | [5309] calling end() on HTTP/2 stream
May 04 17:17:25 data-service-cee57b57d94b 2021-05-04T21:17:25.017Z | server | Received call to method /DataService.RolePermissionsController/add at address 0.0.0.0:30051
May 04 17:17:25 data-service-cee57b57d94b 2021-05-04T21:17:25.064Z | server_call | Request to method /DataService.RolePermissionsController/add ended with status code: OK details: OK
May 04 17:17:25 data-service-cee57b57d94b 2021-05-04T21:17:25.065Z | server_call | Request to method /DataService.RolePermissionsController/add stream closed with rstCode 0
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.065Z | call_stream | [5309] Received server headers:
May 04 17:17:25 web-service-b2e2a5ccd02d        :status: 200
May 04 17:17:25 web-service-b2e2a5ccd02d        server: awselb/2.0
May 04 17:17:25 web-service-b2e2a5ccd02d        date: Tue, 04 May 2021 21:17:25 GMT
May 04 17:17:25 web-service-b2e2a5ccd02d        content-type: application/grpc+proto
May 04 17:17:25 web-service-b2e2a5ccd02d        grpc-accept-encoding: identity
May 04 17:17:25 web-service-b2e2a5ccd02d        grpc-encoding: identity
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.068Z | call_stream | [5309] receive HTTP/2 data frame of length 1106
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.068Z | call_stream | [5309] parsed message of length 1106
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.069Z | call_stream | [5309] filterReceivedMessage of length 1106
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.069Z | call_stream | [5309] pushing to reader message of length 1101
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.069Z | call_stream | [5309] Received server trailers:
May 04 17:17:25 web-service-b2e2a5ccd02d         ’b6 V   us: 0
May 04 17:17:25 web-service-b2e2a5ccd02d        grpc-message: OK
May 04 17:17:25 web-service-b2e2a5ccd02d Failed to add metadata entry  ’b6 V   us: 0. Metadata key " ’b6 v   us" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.069Z | call_stream | [5309] received status details string "OK" from server
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.069Z | call_stream | [5309] ended with status: code=2 details="OK"
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.069Z | subchannel_refcount | 10.39.50.239:30051 callRefcount 1 -> 0
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.069Z | call_stream | [5309] close http2 stream with code 8
May 04 17:17:25 web-service-b2e2a5ccd02d 2021-05-04T21:17:25.070Z | call_stream | [5309] HTTP/2 stream closed with code 8
May 04 17:17:25 web-service-b2e2a5ccd02d error: grpc callMethod failed to call add with error:
May 04 17:17:25 web-service-b2e2a5ccd02d error:  message=2 UNKNOWN: OK, name=Error, code=2, details=OK, , , isOperational=true, stack=Error: 2 UNKNOWN: OK
May 04 17:17:25 web-service-b2e2a5ccd02d     at Object.callErrorFromStatus (/opt/resolver/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
May 04 17:17:25 web-service-b2e2a5ccd02d     at Object.onReceiveStatus (/opt/resolver/node_modules/@grpc/grpc-js/build/src/client.js:176:52)
May 04 17:17:25 web-service-b2e2a5ccd02d     at Object.onReceiveStatus (/opt/resolver/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:336:141)
May 04 17:17:25 web-service-b2e2a5ccd02d     at Object.onReceiveStatus (/opt/resolver/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:299:181)
May 04 17:17:25 web-service-b2e2a5ccd02d     at /opt/resolver/node_modules/@grpc/grpc-js/build/src/call-stream.js:145:78
May 04 17:17:25 web-service-b2e2a5ccd02d     at processTicksAndRejections (internal/process/task_queues.js:79:11), code=2, details=OK, $ref=$["cause"]["metadata"]

Could it be something with AWS or something that should be addressed in the server/client?

murgatroid99 commented 3 years ago

That's new. This is the first instance I've seen of the grpc-status header name getting corrupted. That's why the call is failing when that happens.

We don't know why these errors happen. This issue is for information gathering, so that we can try to figure it out.

It looks like that client is talking to your own server through the ALB. Is that correct? Can you trigger that error with any degree of consistency? If so, can you additionally set the environment variable NODE_DEBUG=http2 and share the output from around where that error occurs? I don't actually know if it will provide any useful information, but it might.

felipegcampos commented 3 years ago

Yes, the server and the client are talking to each other through ALB. The problem is not consistent in the sense that it won't always happen with the same requests. However, it always happens somewhere when we run automation. We are using 12.20.1-alpine3.11 and here are the logs with the extra information you asked for:

May 05 11:51:03 object-service-a2aaac317c46 HTTP2 17: Http2Stream 385 [Http2Session server]: initiating response
May 05 11:51:03 object-service-a2aaac317c46 2021-05-05T15:51:03.889Z | server_call | Request to method /ObjectService.ObjectTypeLifeCycleController/retrieveAll ended with status code: OK details: OK
May 05 11:51:03 object-service-a2aaac317c46 HTTP2 17: Http2Stream 385 [Http2Session server]: sending trailers
May 05 11:51:03 web-service-cd3ec20b0aaf HTTP2 17: Http2Stream 477 [Http2Session client]: headers received
May 05 11:51:03 web-service-cd3ec20b0aaf HTTP2 17: Http2Stream 477 [Http2Session client]: emitting stream 'response' event
May 05 11:51:03 object-service-a2aaac317c46 HTTP2 17: Http2Stream 385 [Http2Session server]: closed with code 0, closed false, readable false
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.890Z | call_stream | [674] Received server headers:
May 05 11:51:03 object-service-a2aaac317c46 HTTP2 17: Http2Stream 385 [Http2Session server]: destroying stream
May 05 11:51:03 web-service-cd3ec20b0aaf        :status: 200
May 05 11:51:03 object-service-a2aaac317c46 2021-05-05T15:51:03.890Z | server_call | Request to method /ObjectService.ObjectTypeLifeCycleController/retrieveAll stream closed with rstCode 0
May 05 11:51:03 web-service-cd3ec20b0aaf        server: awselb/2.0
May 05 11:51:03 web-service-cd3ec20b0aaf        date: Wed, 05 May 2021 15:51:03 GMT
May 05 11:51:03 web-service-cd3ec20b0aaf        content-type: application/grpc+proto
May 05 11:51:03 web-service-cd3ec20b0aaf        grpc-accept-encoding: identity
May 05 11:51:03 web-service-cd3ec20b0aaf        grpc-encoding: identity
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.890Z | call_stream | [674] receive HTTP/2 data frame of length 440
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.890Z | call_stream | [674] parsed message of length 440
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.890Z | call_stream | [674] filterReceivedMessage of length 440
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.890Z | call_stream | [674] pushing to reader message of length 435
May 05 11:51:03 web-service-cd3ec20b0aaf HTTP2 17: Http2Stream 477 [Http2Session client]: headers received
May 05 11:51:03 web-service-cd3ec20b0aaf HTTP2 17: Http2Stream 477 [Http2Session client]: emitting stream 'trailers' event
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.891Z | call_stream | [674] Received server trailers:
May 05 11:51:03 web-service-cd3ec20b0aaf         ²f¨8V   us: 0
May 05 11:51:03 web-service-cd3ec20b0aaf        grpc-message: OK
May 05 11:51:03 web-service-cd3ec20b0aaf Failed to add metadata entry  ²f¨8V   us: 0. Metadata key " ²f¨8v   us" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.892Z | call_stream | [674] received status details string "OK" from server
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.892Z | call_stream | [674] ended with status: code=2 details="OK"
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.892Z | subchannel_refcount | 10.39.50.239:30052 callRefcount 1 -> 0
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.892Z | call_stream | [674] close http2 stream with code 8
May 05 11:51:03 web-service-cd3ec20b0aaf HTTP2 17: Http2Stream 477 [Http2Session client]: closed with code 0, closed true, readable true
May 05 11:51:03 web-service-cd3ec20b0aaf HTTP2 17: Http2Stream 477 [Http2Session client]: destroying stream
May 05 11:51:03 web-service-cd3ec20b0aaf 2021-05-05T15:51:03.893Z | call_stream | [674] HTTP/2 stream closed with code 8
May 05 11:51:03 web-service-cd3ec20b0aaf error: grpc callMethod failed to call retrieveAll with error:
May 05 11:51:03 web-service-cd3ec20b0aaf error:  message=2 UNKNOWN: OK, name=Error, code=2, details=OK, , , isOperational=true, stack=Error: 2 UNKNOWN: OK
May 05 11:51:03 web-service-cd3ec20b0aaf     at Object.callErrorFromStatus (/opt/resolver/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
May 05 11:51:03 web-service-cd3ec20b0aaf     at Object.onReceiveStatus (/opt/resolver/node_modules/@grpc/grpc-js/build/src/client.js:176:52)
May 05 11:51:03 web-service-cd3ec20b0aaf     at Object.onReceiveStatus (/opt/resolver/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:336:141)
May 05 11:51:03 web-service-cd3ec20b0aaf     at Object.onReceiveStatus (/opt/resolver/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:299:181)
May 05 11:51:03 web-service-cd3ec20b0aaf     at /opt/resolver/node_modules/@grpc/grpc-js/build/src/call-stream.js:145:78
May 05 11:51:03 web-service-cd3ec20b0aaf     at processTicksAndRejections (internal/process/task_queues.js:79:11), code=2, details=OK, $ref=$["cause"]["metadata"]
May 05 11:51:03 web-service-cd3ec20b0aaf error: 210505/155103.829, [error] message: 2 UNKNOWN: OK, stack: Error: 2 UNKNOWN: OK
May 05 11:51:03 web-service-cd3ec20b0aaf     at /opt/resolver/packages/grpc/helpers.js:172:33
May 05 11:51:03 web-service-cd3ec20b0aaf     at tryCatcher (/opt/resolver/node_modules/bluebird/js/release/util.js:16:23)
May 05 11:51:03 web-service-cd3ec20b0aaf     at Promise._settlePromiseFromHandler (/opt/resolver/node_modules/bluebird/js/release/promise.js:547:31)
May 05 11:51:03 web-service-cd3ec20b0aaf     at Promise._settlePromise (/opt/resolver/node_modules/bluebird/js/release/promise.js:604:18)
May 05 11:51:03 web-service-cd3ec20b0aaf     at Promise._settlePromise0 (/opt/resolver/node_modules/bluebird/js/release/promise.js:649:10)
May 05 11:51:03 web-service-cd3ec20b0aaf     at Promise._settlePromises (/opt/resolver/node_modules/bluebird/js/release/promise.js:725:18)
May 05 11:51:03 web-service-cd3ec20b0aaf     at _drainQueueStep (/opt/resolver/node_modules/bluebird/js/release/async.js:93:12)
May 05 11:51:03 web-service-cd3ec20b0aaf     at _drainQueue (/opt/resolver/node_modules/bluebird/js/release/async.js:86:9)
May 05 11:51:03 web-service-cd3ec20b0aaf     at Async._drainQueues (/opt/resolver/node_modules/bluebird/js/release/async.js:102:5)
May 05 11:51:03 web-service-cd3ec20b0aaf     at Immediate.Async.drainQueues [as _onImmediate] (/opt/resolver/node_modules/bluebird/js/release/async.js:15:14)
May 05 11:51:03 web-service-cd3ec20b0aaf     at processImmediate **(internal/timers.js:461:21)**
murgatroid99 commented 3 years ago

Unfortunately, it doesn't look like those extra logs have any useful information, sorry. I do think your information is strong evidence that that this is an issue in Node itself, as opposed to some other server or proxy, but I don't even know where to start looking for the root cause, and I don't know how to get enough relevant information to file a solid, fixable issue for the Node core.

felipegcampos commented 3 years ago

Unfortunately, it doesn't look like those extra logs have any useful information, sorry. I do think your information is strong evidence that this is an issue in Node itself, as opposed to some other server or proxy, but I don't even know where to start looking for the root cause, and I don't know how to get enough relevant information to file a solid, fixable issue for the Node core.

We're running more tests to confirm and we had to upgrade a bunch of dependencies but it seems from the preliminary tests that it works with Node v14.16.1.


Update: In case anyone else experiences the same issue, I can confirm it is no longer a problem with Node v14.16.1.

HsinHeng commented 3 years ago

Unfortunately, it doesn't look like those extra logs have any useful information, sorry. I do think your information is strong evidence that this is an issue in Node itself, as opposed to some other server or proxy, but I don't even know where to start looking for the root cause, and I don't know how to get enough relevant information to file a solid, fixable issue for the Node core.

We're running more tests to confirm and we had to upgrade a bunch of dependencies but it seems from the preliminary tests that it works with Node v14.16.1.

Update: In case anyone else experiences the same issue, I can confirm it is no longer a problem with Node v14.16.1.

AWS EC2 Node: 10.16.3 @grpc/grpc-js: "^1.3.2", @grpc/proto-loader: "^0.6.2",

I have issue in Node.JS: 10.16.3

Failed to add metadata entry €ÒÓÔhUpt-ding: identity. Metadata key "€òóôhupt-ding" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
{"type":"OperationalError","message":"2 UNKNOWN: OK","code":2,"filename":"/src/node_modules/@grpc/grpc-js/build/src/call.js","line":31,"function":"Object.callErrorFromStatus"}

I recently upgrade node to 14.17.0 and try it.

lwmxiaobei commented 3 years ago

Node: v14.17.5 @grpc/grpc-js: "^1.3.1", @grpc/proto-loader: "^0.5.6"

Failed to add metadata entry  ![ŽUus: 0. Metadata key " ![Žuus" contains illegal characters. For more information see https://github.com/grpc/grpc-node/issues/1173
--
murgatroid99 commented 10 months ago

Closing because this has not been reported in over two years. The root cause has likely been fixed.

olealgoritme commented 3 months ago

Same issue, using "@google-cloud/pubsub": "^4.5.0"

�w.`��y���:��{w_:6������my" contains illegal characters. For more information see....