redis-rb / redis-cluster-client

Redis cluster-aware client for Ruby
https://rubygems.org/gems/redis-cluster-client
MIT License
21 stars 9 forks source link

test_against_cluster_broken tests can't be added to #354

Closed KJTsanaktsidis closed 1 month ago

KJTsanaktsidis commented 7 months ago

The tests in test_against_cluster_broken.rb send a SHUTDOWN command to a redis node and validate how the client responds to this. These are great integration tests against real failure scenarios!

However, the tests have no way to actually bring that node back up afterwards, so it's impossible to add more test cases to that file; the second test case already has a missing node, the third test case will have two missing nodes, etc.

I'd like to refactor how these integration tests are set up to make them more self-contained and able to actually reset the cluster to its original state after every test case. This will enable me to add more test cases to the file.

What I'm thinking, roughly, is:

So the idea would be that you could run the tests like this:

$ bundle exec rake cluster:start # Starts a cluster, saves the port & pid details to a file in tmp/
$ bundle exec rake test # Runs the normal tests, using the just-started cluster
$ bundle exec rake test:integration # Runs the test_against_cluster_{scale,broken,state} tests, using the cluster from step 1 and restarting/re-creating it between each test case
$ bundle exec rake cluster:stop # Stop the cluster

As a bonus, this would make running the tests work on both MacOS & Linux the same way (at the moment, the dockerfiles are not usable on macos because the internal network is not accessible from mac, only the forwarded ports are (e.g. see https://github.com/redis/redis/issues/7460).

What do you think? Is this something I can begin work on?

supercaracal commented 7 months ago

Thank you for your proposal to enhance integration test cases.

I had acknowleged the difference of the docker network mode between Linux and MacOS. I decided to use docker compose bacause I don't want to build Redis server, GitHub Actions machine is Ubuntu and I'm using WSL2. But I know majority of engineers use MacOS. So as you said, current our development environment may be awkward.

Please give me some time to think about the follows.

KJTsanaktsidis commented 7 months ago

OK - let me know when you've got something you'd like to discuss. Happy to try out any way you think is good. I need to fix this so I can get the test in https://github.com/redis-rb/redis-cluster-client/pull/355 working :)

KJTsanaktsidis commented 6 months ago

Hey @supercaracal have you had any more thoughts about my plan to fix the test_against_cluster_broken tests? Is this something I can begin working on?

KJTsanaktsidis commented 3 months ago

@supercaracal I would like to come back to this at some point in the near future - would you be receptive to a PR along the lines of what's in my original message?

supercaracal commented 3 months ago

Hello, @KJTsanaktsidis

I apologize for my too late replying about this. I think it would be better to avoid a concurrent use of containers and binaries. So I'm still wondering about this change because it needs drastically modifications for the testing environment. Since this issue is low priority in my plan, I'm hard to take my time to review or to improve our test environment for a while. Although I'm keen to review or to enhance our test environment if it would be simpler and elegant anytime.

Sincerely,

KJTsanaktsidis commented 3 months ago

I think it would be better to avoid a concurrent use of containers and binaries

This is a fair point - I can make the test cases communicate with the Docker daemon (either directly over the Docker socket, or with the docker-compose CLI tool) and manage the redis instances that way. This would mean that you must run the tests with docker-compose though. Is that acceptable?

supercaracal commented 3 months ago

I think it may be an option to correspond this issue. Or DEBUG command may be useful.

https://github.com/redis/redis/blob/f60370ce28b946c1146dcea77c9c399d39601aaa/src/debug.c#L393-L497 enable-debug-command yes

KJTsanaktsidis commented 3 months ago

The DEBUG command doesn’t have a way to bring a server back up after shutting it down (other than with CRASH-AND-RECOVER, but that will be flaky because all it can do is recover after a certain timeout, not “the end of this test”).

I’ll try and put something together based on docker-compose this week.

supercaracal commented 2 months ago

Although they still doesn't work for macOS, I've organized integration test cases as the followings:

test senario
test/test_against_cluster_broken.rb failover, a part of nodes down, recovering
test/test_against_cluster_down.rb whole cluster down, rebuild
test/test_against_cluster_scale.rb scale out, scale in
test/test_against_cluster_state.rb resharding
supercaracal commented 1 month ago

The version 0.11.6 has been released. https://rubygems.org/gems/redis-cluster-client/versions/0.11.6

supercaracal commented 1 month ago

Although the other issue about the docker network mode per os still remained, I'll close this issue because the main problem was solved. Feel free to reopen it whenever some additional related problems should be resolved.