redis / node-redis

Redis Node.js client
https://redis.js.org/
MIT License
16.85k stars 1.87k forks source link

Incorrect setReturnBuffers when parsing PubSub 'message' response #1870

Closed aleiby closed 2 years ago

aleiby commented 2 years ago

Since upgrading my redis server to 6.2.6 and redis client to 4.0.2 (from 4.0.1) I've been occasionally seeing the following crash: TypeError [ERR_INVALID_ARG_TYPE]: The "otherBuffer" argument must be an instance of Buffer or Uint8Array. Received type string ('message')

I've tracked this down to parser.optionReturnBuffers being set to false when it should be true causing the 'message' string to be returned as a string rather than a Buffer.

commands-queue.ts expects this to be a Buffer, and Buffer.equals throws the above NodeError.

This response comes in through parseResponse which calls this.#setReturnBuffers(). I haven't been able to track down why this works correctly most of the time, but occasionally winds up setting parser.optionReturnBuffers to false and blowing up. Hopefully someone more familiar with the code can spot what's going on.

Environment:

aleiby commented 2 years ago

Commenting out setReturnBuffers in shiftWaitingForReply avoids the crash for now. More details in a comment I left here: https://github.com/redis/node-redis/commit/77022209bde327d6a2d004dd07153bb6140d0ae2

I expect that will cause other problems since that change was made for a reason (which was unfortunately not described in the commit message).

leibale commented 2 years ago

@aleiby I haven't debugged it yet, but I'm working on a new RESP parser, which (hopefully) will solve this problem as well. Will keep you posted.

manikyafc commented 2 years ago

This issue is affecting our production as well, when should we expect a fix for this?

leibale commented 2 years ago

@aleiby @manikyafc I think that the changes in #1899 solve this problem, do you have a reproduction of the issue?

aleiby commented 2 years ago

It happens fairly reliably in my codebase, but I was unable to distill it down into an easily sharable repro unfortunately. You need to construct a scenario where parseResponse handles both a subscribe rely followed by a message reply in a single call.

manikyafc commented 2 years ago

Hi, thanks for the potential fix. I don't have steps to reproduce the issue, just that we can observe this issue in server logs/during testing on local from time to time, I can try cloning to local and test if the issue shows up again

binarymist commented 2 years ago

Same deal for us. This is the stack trace:

 node:buffer:816
     throw new ERR_INVALID_ARG_TYPE(
     ^

 TypeError [ERR_INVALID_ARG_TYPE]: The "otherBuffer" argument must be an instance of Buffer or Uint8Array. Received type string ('message')
     at new NodeError (node:internal/errors:371:5)
     at Buffer.equals (node:buffer:816:11)
     at JavascriptRedisParser.returnReply (/usr/src/app/node_modules/@node-redis/client/dist/lib/client/commands-queue.js:42:123)
     at JavascriptRedisParser.execute (/usr/src/app/node_modules/redis-parser/lib/parser.js:544:14)
     at RedisCommandsQueue.parseResponse (/usr/src/app/node_modules/@node-redis/client/dist/lib/client/commands-queue.js:194:71)
     at RedisSocket.<anonymous> (/usr/src/app/node_modules/@node-redis/client/dist/lib/client/index.js:336:83)
     at RedisSocket.emit (node:events:390:28)
     at Socket.<anonymous> (/usr/src/app/node_modules/@node-redis/client/dist/lib/client/socket.js:189:44)
     at Socket.emit (node:events:390:28)
     at addChunk (node:internal/streams/readable:324:12) {
   code: 'ERR_INVALID_ARG_TYPE'
 }

 Node.js v17.3.0

Call site: https://github.com/purpleteam-labs/purpleteam-orchestrator/blob/96fa97c7135130cd66d2d4a1e691f3cc84e32550/src/api/orchestration/subscribers/testerWatcher.js#L39

Happens every time the code is executed. Also a production service.

Currently blocked on this.

binarymist commented 2 years ago

Hi @leibale : I tried your branch as in:

Our package.json:

...
"dependencied": {
...
  "redis": "leibale/node-redis.git#parser",
...
}
...

and we get:

Error [ERR_MODULE_NOT_FOUND]: Cannot find package '/usr/src/app/node_modules/redis/' imported from /our/file.js
     at new NodeError (node:internal/errors:371:5)
     at legacyMainResolve (node:internal/modules/esm/resolve:312:9)
     at packageResolve (node:internal/modules/esm/resolve:857:14)
     at moduleResolve (node:internal/modules/esm/resolve:910:18)
     at defaultResolve (node:internal/modules/esm/resolve:1005:11)
     at ESMLoader.resolve (node:internal/modules/esm/loader:530:30)
     at ESMLoader.getModuleJob (node:internal/modules/esm/loader:251:18)
     at ModuleWrap.<anonymous> (node:internal/modules/esm/module_job:79:40)
     at link (node:internal/modules/esm/module_job:78:36) {
   code: 'ERR_MODULE_NOT_FOUND'
 }

 Node.js v17.3.0

Is this not supposed to be a drop-in fix? Thanks.

leibale commented 2 years ago

@binarymist first of all, thanks for helping! :)

Regarding the error, you can't npm install directly from GitHub, cause of typescript, which needs to be compiled/transpiled... You'll have to:

git clone --single-branch --branch parser https://github.com/leibale/node-redis
cd node-redis
npm install -ws
npm run build-all

then use npm link to link your local version

NormandoHall commented 2 years ago

Same here. Waiting the release that fixes this bug.

binarymist commented 2 years ago

Thanks @leibale. Not sure we're helping ;-), just need to get this fixed before we can release/publish.

Our project that consumes the redis NPM package that is experiencing this issue is run in a container.

I've added the following to the Dockerfile:

RUN cd /usr/src/ && \
  git clone --single-branch --branch parser https://github.com/leibale/node-redis && \
  cd node-redis && \
  npm install -ws && \
  npm run build-all && \
  npm link

package.json reverted to:

...
"dependencies": {
  ...
  "redis": "^4.0.2",
  ...
}
...

Still get the ERR_MODULE_NOT_FOUND error. What are we missing?

binarymist commented 2 years ago

@leibale: Which Pull Request is supposed to fix this, #1899 or the newer #2016 from @jhiesey? Any further thoughts on how to test these changes in the container? Is there an ETA on the fix? We currently have a large release blocked on this.

jhiesey commented 2 years ago

@binarymist I can't speak for the maintainers, but my PR is a quick fix for the immediate issue. #1899 is a rewrite of the parser and is a better long term solution but I suspect it's not quite ready.

The workaround I'm currently using is to install patch-package and add this patch file, which is equivalent to the change in #2016 at patches/@node-redis+client+1.0.4.patch:

diff --git a/node_modules/@node-redis/client/dist/lib/client/commands-queue.js b/node_modules/@node-redis/client/dist/lib/client/commands-queue.js
index 8d22295..d3e7807 100644
--- a/node_modules/@node-redis/client/dist/lib/client/commands-queue.js
+++ b/node_modules/@node-redis/client/dist/lib/client/commands-queue.js
@@ -279,7 +279,8 @@ _a = RedisCommandsQueue, _RedisCommandsQueue_maxLength = new WeakMap(), _RedisCo
 }, _RedisCommandsQueue_setReturnBuffers = function _RedisCommandsQueue_setReturnBuffers() {
     var _b, _c;
     __classPrivateFieldGet(this, _RedisCommandsQueue_parser, "f").setReturnBuffers(!!((_b = __classPrivateFieldGet(this, _RedisCommandsQueue_waitingForReply, "f").head) === null || _b === void 0 ? void 0 : _b.value.returnBuffers) ||
-        !!((_c = __classPrivateFieldGet(this, _RedisCommandsQueue_pubSubState, "f")) === null || _c === void 0 ? void 0 : _c.subscribed));
+        !!((_c = __classPrivateFieldGet(this, _RedisCommandsQueue_pubSubState, "f")) === null || _c === void 0 ? void 0 : _c.subscribed) ||
+        !!((_c = __classPrivateFieldGet(this, _RedisCommandsQueue_pubSubState, "f")) === null || _c === void 0 ? void 0 : _c.subscribing));
 }, _RedisCommandsQueue_shiftWaitingForReply = function _RedisCommandsQueue_shiftWaitingForReply() {
     if (!__classPrivateFieldGet(this, _RedisCommandsQueue_waitingForReply, "f").length) {
         throw new Error('Got an unexpected reply from Redis');
binarymist commented 2 years ago

That's a useful tool @jhiesey, although there seems to be another issue:

TypeError: nonBlockingClients[channel].rpush is not a function
    at file:///usr/src/app/src/api/orchestration/subscribers/testerWatcher.js:72:41
    at Function._RedisCommandsQueue_emitPubSubMessage (/usr/src/app/node_modules/@node-redis/client/dist/lib/client/commands-queue.js:231:9)
    at JavascriptRedisParser.returnReply (/usr/src/app/node_modules/@node-redis/client/dist/lib/client/commands-queue.js:43:123)
    at JavascriptRedisParser.execute (/usr/src/app/node_modules/redis-parser/lib/parser.js:544:14)
    at RedisCommandsQueue.parseResponse (/usr/src/app/node_modules/@node-redis/client/dist/lib/client/commands-queue.js:194:71)
    at RedisSocket.<anonymous> (/usr/src/app/node_modules/@node-redis/client/dist/lib/client/index.js:336:83)
    at RedisSocket.emit (node:events:390:28)
    at Socket.<anonymous> (/usr/src/app/node_modules/@node-redis/client/dist/lib/client/socket.js:189:44)
    at Socket.emit (node:events:390:28)
    at addChunk (node:internal/streams/readable:324:12)

Which I'm now getting earler at: https://github.com/purpleteam-labs/purpleteam-orchestrator/blob/0f77b4a9d31844629f323891b5b9ce7f9e7ce98e/src/api/orchestration/subscribers/testerWatcher.js#L72

This doesn't appear to be related to your patch @jhiesey, as when I npx patch-package --reverse and check that the change in node_modules/@node-redis/client/dist/lib/client/commands-queue.js within the container is removed I get the same error. I did update to node redis 4.0.4 . Maybe there is just another bug now @leibale?

I don't think I mssed anything in https://github.com/redis/node-redis/blob/HEAD/docs/v3-to-v4.md

binarymist commented 2 years ago

Ok, the next problem was casing of rpush, found the answer many pages through the issues list here.

Something is still broken though, I'm just not getting any errors now.

leibale commented 2 years ago

@binarymist change it to nonBlockingClients[channel].rPush (capital P)

binarymist commented 2 years ago

@binarymist change it to nonBlockingClients[channel].rPush (capital P)

Yip, got that one thanks.

Something is still broken though, I'm going to have to spend more time debugging.

digitalml commented 2 years ago

I see this issue as closed but I am getting this error for pub sub messages randomly every now and then. "The "otherBuffer" argument must be an instance of Buffer or Uint8Array." I've read through this closed issue but it's not clear to me How to resolve this? Can someone please point it out?

Thank you!

mifopen commented 2 years ago

@leibale Could you please explain why the issue was closed?

mifopen commented 2 years ago

Also, could anybody explain how such an exception could be caught? Currently, it's just crashing my server completely.

jhiesey commented 2 years ago

@digitalml @mifopen the issue is closed because the fix was merged to master. Unfortunately there hasn't been a release published that includes the fix; hopefully there will be one soon. Perhaps @leibale could comment on when.

You can use my workaround in https://github.com/redis/node-redis/issues/1870#issuecomment-1057530027 for the time being

alexzaporozhets commented 2 years ago

@leibale when do you plan to release the next version with this fix?

leibale commented 2 years ago

A new version with @jhiesey fix will be released tomorrow. Sorry it took so much time.

leibale commented 2 years ago

@alexzaporozhets @NormandoHall @digitalml @jhiesey @mifopen @binarymist version 4.0.6 is out! :) (do not try 4.0.5, I've made a mistake in the release process... :man_facepalming:)