"eval" within "multi" crashes DF

boazsade commented 1 year ago

Describe the bug When running pipeline test from ioredis, we are getting stuck. This is a result of a pipeline test that execute the commands: set, eval and get. The end result is that we are not returning to the client and DF cannot be shutdown as it is stuck on a command that never completed.

To Reproduce Steps to reproduce the behavior: 1.multi

set foo asdf
EVAL "return {KEYS[1],KEYS[2],ARGV[1],ARGV[2]}" 2 key1 key2 first second
get foo
exec This command would not "return", the connection will stay opened even when the connection is closed on the Redis, client side. You would not be able to shutdown DF.

Expected behavior This should return a result for the client:

1) OK
2) 1) "key1"
   2) "key2"
   3) "first"
   4) "second"
3) "asdf"

And we should be able to shutdown DF after that.

Environment (please complete the following information):

OS: ubuntu 22.04
Kernel:Linux dfly2 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Bare Metal
Dragonfly Version: 0.10.0

Reproducible Code Snippet

it("should be supported in transaction blocks", (done) => {
      redis
        .pipeline()
        .multi()
        .set("foo", "asdf")
        .echo("bar", "baz", "123", "abc")
        .get("foo")
        .exec()
        .exec(function (err, results) {
          expect(err).to.eql(null);
          expect(results[4][1][1]).to.eql(["bar", "baz", "123", "abc"]);
          expect(results[4][1][2]).to.eql("asdf");
          done();
        });
    });
  });

Additional context This code would work:

 it("should support callbacks", (done) => {
      let pending = 1;
      redis
        .pipeline()
        .echo("foo", "bar", "123", "abc", function (err, result) {
          pending -= 1;
          expect(err).to.eql(null);
          expect(result).to.eql(["foo", "bar", "123", "abc"]);
        })
        .exec(function (err, results) {
          expect(err).to.eql(null);
          expect(results).to.eql([[null, ["foo", "bar", "123", "abc"]]]);
          expect(pending).to.eql(0);
          done();
        });
    });

boazsade commented 1 year ago

Once this is resolved, please update the list of supported tests in ioredis

boazsade commented 1 year ago

This is triggered by "transaction.cc:420] Check failed: 0u == txid_ (0 vs. 1)" To reproduce this the only requirement is to do:

MULTI
EVAL "return {KEYS[1],KEYS[2],ARGV[1],ARGV[2]}" 2 key1 key2 first second
EXEC This would trigger the crash. When in release mode, and there are more commands in the transaction it would not crash but hung

romange commented 1 year ago

Thanks Boaz. Can you open a branch Bug457 that adds this test to dragonfly_test?

On Mon, Nov 7, 2022, 10:09 Boaz Sade @.***> wrote:

This is triggered by "transaction.cc:420] Check failed: 0u == txid_ (0 vs. 1)" To reproduce this the only requirement is to do:

MULTI

EVAL "return {KEYS[1],KEYS[2],ARGV[1],ARGV[2]}" 2 key1 key2 first second

EXEC This would trigger the crash. When in release mode, and there are more commands in the transaction it would not crash but hung

— Reply to this email directly, view it on GitHub https://github.com/dragonflydb/dragonfly/issues/457#issuecomment-1305227838, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA4BFCEVAN5ULW72QQI5CN3WHC2KPANCNFSM6AAAAAARVEDIBA . You are receiving this because you are subscribed to this thread.Message ID: @.***>

boazsade commented 1 year ago

yes

boazsade commented 1 year ago

The issue is triggered by running transaction within transaction. This happens because we have:

EXEC is starting a transaction (in the context of multi)
Then from within this transaction we scheduling transaction operation from the script context, This calls the function "ScheduleInternal" which verifies that its the a clean transaction state (transaction id == 0). But this is not true any more, since we already pass through this function at step 1.

The fix for this would require supporting this type of flow - starting a context of transaction and then running a script that requires its own context.

For now I will try to issue a workaround that would fail this operation (would not allow for running script inside "multi"). so the result would not be either a crash or that the context is hung and the the application cannot even shutdown as it is now. I will also add a test to verify for now that we are getting an error and not other issue, and once this is fixed, we would see that this test would fail

romange commented 1 year ago

Fixed

boazsade commented 1 year ago

Just note that this is a temporary fix - we have issue #467 as a flow-up for it. There is also a branch issue-467, to work on it

dragonflydb / dragonfly

"eval" within "multi" crashes DF #457