Simple local mode as developer friendly tooling

Totktonada commented 3 years ago

I have several ideas on how to make the tool more friendly for developers. I assume that a user intends to:

Reproduce a problem from CI in a local environment to debug it.
Change test parameters, and maybe the test itself, and get fast feedback.
Add a test and perform some debugging locally at an early development stage of the test.

We can pick the lowest hanging fruits:

Don't perform any dependency installation procedures.
Don't run any nemeses in this local mode.
Assume that tarantool is in PATH and run instances using it (anyone can easily modify PATH).
Accept a list of URIs and don't start any instances at all (may be useful to, say, control instances manually from other terminals).
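The last two points could be combined into one decision, sketched below in Python for illustration only (the harness itself is written in Clojure). The function name `plan_local_run` and the returned dictionary layout are hypothetical, not part of the tool: if URIs are given, nothing is started; otherwise the `tarantool` executable is looked up in PATH.

```python
import shutil
from typing import List, Optional


def plan_local_run(uris: List[str], path: Optional[str] = None) -> dict:
    """Decide how a hypothetical local mode would obtain its instances.

    uris: instance URIs supplied by the user (may be empty).
    path: search path override; None means the PATH of the current process.
    """
    if uris:
        # The user manages the instances (say, from other terminals),
        # so the harness only connects to them and starts nothing.
        return {"mode": "attach", "uris": list(uris), "executable": None}

    # No URIs given: rely on whatever tarantool is first in PATH.
    executable = shutil.which("tarantool", path=path)
    if executable is None:
        raise FileNotFoundError("tarantool not found in PATH")
    return {"mode": "spawn", "uris": [], "executable": executable}
```

With this shape, testing a patched build needs no harness change: prepend the build directory to PATH, or start the instances by hand and pass their URIs.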

ligurio commented 3 years ago

https://github.com/jepsen-io/jepsen/tree/main/docker

Totktonada commented 3 years ago

There are pros and cons. Docker is convenient for reproducing a problem locally when we consider tarantool as a black box. And I guess that it allows running some nemeses. However, injecting a patched tarantool executable requires many manual steps (and likely rebuilding from scratch on each run). I think that the ability to run the tests with tarantool on the host machine is important for debugging tarantool problems.

ligurio commented 3 years ago

I would first ask people from the scaling team about the problems they have with reproducing bugs found by Jepsen tests. Otherwise, we are solving non-existent problems.

Totktonada commented 3 years ago

I have a considerable amount of experience with reproducing and fixing bugs around tarantool (including ones that appear only on the customer side, ones that can be reproduced only in a certain environment, ones that are triggered by a rare condition, etc.), so I don't think that we should throw out my suggestions as something about a 'non-existing problem'.

If you doubt a particular point of my proposal, please explain why you think it does not give any real benefit, or ask a question if the benefit is not obvious. But, please, don't say that I'm not the right person to have an adequate vision of how to reproduce and debug problems.

(AFAIK, the Scaling Team never tries to feed workloads using the tool, and I'm quite sure that the main reason is that it is not simple to get started. So we are unlikely to receive good feedback. Uroboros. That's something we should decide between us.)

It seems I should explain my points a bit.

Debugging is an iterative process, and it is much easier when you work on your developer machine in your own convenient environment than when you build and deploy each new fprintf(stderr, "AAA\n"); through some non-trivial process. A large delay between iterations slows debugging down by an order of magnitude or worse.

Or, say, take Docker (it's local, so what's the problem?). Here we have the same problem of delivering the tarantool executable. We would have to set up some (volume-based?) way to edit sources on the host machine and rebuild them within docker, or make the host build usable in docker (without being hit by those weird problems with root-owned files). More setup, more steps to test your debug print.

Or, for example, you need to attach strace / gdb, but all the preliminary work was done in a non-privileged docker container, so what now, start from scratch? (I know the workaround of adding the ptrace capability to the container's json config, but I would not expect it to be obvious to everyone.)

So we should be able to feed a workload into tarantool at a given URI or into the tarantool found in PATH. It is a must-have for a testing harness that aims to test tarantool. We can provide other ways (docker, cloud, ...), but they are not for investigating problems; at least, most problems should not require them.

You were the person who pushed us toward randomized stress testing with deep historical analysis. I'm sad that it may be thrown out just because we didn't think about how to make the harness convenient for a developer and didn't add several popen calls.

It is way simpler than the cloud-based solution that is already implemented.
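To give a feeling for how small "several popen calls" can be, here is a minimal Python sketch (an illustration only, not the harness code, which is Clojure). The helper names `write_init_script`, `start_instances` and `stop_instances`, the directory layout and the port numbers are assumptions made for the example; the only tarantool specifics relied on are running a Lua script as `tarantool init.lua` and the `listen` and `work_dir` options of `box.cfg`.

```python
import subprocess
from pathlib import Path
from typing import List


def write_init_script(work_dir: Path, port: int) -> Path:
    """Create a per-instance directory with a minimal init.lua (hypothetical layout)."""
    work_dir.mkdir(parents=True, exist_ok=True)
    script = work_dir / "init.lua"
    # Assumes the path contains no single quotes, since it is pasted into Lua.
    script.write_text(f"box.cfg{{listen = {port}, work_dir = '{work_dir}'}}\n")
    return script


def start_instances(executable: str, base_dir: Path,
                    ports: List[int]) -> List[subprocess.Popen]:
    """Start one instance per port using the given tarantool executable."""
    procs = []
    for port in ports:
        script = write_init_script(base_dir / f"instance-{port}", port)
        procs.append(subprocess.Popen([executable, str(script)]))
    return procs


def stop_instances(procs: List[subprocess.Popen]) -> None:
    """Ask every instance to terminate, then wait for each to exit."""
    for proc in procs:
        proc.terminate()
    for proc in procs:
        proc.wait(timeout=10)
```

A local run would then be roughly `procs = start_instances("tarantool", Path("/tmp/jepsen-local"), [3301, 3302, 3303])`, followed by the workload and `stop_instances(procs)`. Replication setup and readiness checks are deliberately left out of the sketch.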

(Everything said above assumes that we'll want to use this tool for Jepsen-style testing. We'll have Clojure, and doubts about its readability, anyway. There are other ways to implement the same idea and I don't mind discussing them. But the points above are relevant for any implementation.)

tarantool / jepsen.tarantool

Simple local mode as developer friendly tooling #97