scylladb / scylladb

NoSQL data store using the seastar framework, compatible with Apache Cassandra
http://scylladb.com
GNU Affero General Public License v3.0
13.58k stars 1.29k forks source link

Move object_store test to dtest #16006

Open xemul opened 1 year ago

xemul commented 1 year ago

Cc @bhalevy @tchaikov

bhalevy commented 1 year ago

Since we can't do this atomically, I guess we need an issue in scylla-dtest to reimplement / copy the object_store test from scylla, and then an issue/PR here depending on the former, to delete the existing unit test.

tchaikov commented 11 months ago

but why? i don't object to moving object_store tests to dtest, but neither am i for this without reasonings. but working with two repos surely increases the overhead.

nyh commented 11 months ago

but why? i don't object to moving object_store tests to dtest, but neither am i for this without reasonings. but working with two repos surely increases the overhead.

I second this question. What is the motivation to move object_store specifically out of Scylla.git? Some comments:

  1. I believe we should have in scylla.git a test framework like dtest, which allows individual tests (not "subdirectories" or "suites") to configure a cluster as they wish and run tests on them. The current test/topology* is close, but not exactly there.
  2. One of the "features" of dtest is that it allows very slow tests. We could allow this in scylla.git too, requiring an option (like test/alternator/run's "--runveryslow" option) to run the very slow tests. The very slow tests would be run on Jenkins, but not when a developer wants to run some tests quickly.
  3. If object_store doesn't fit the current test.py methodology and you are frustrated by that, you don't have to be - some other test frameworks also don't. For example the Alternator tests currently run a script, test/alternator/run - and aren't understood by test.py. You can start this way and only integrate it closely with test.py when you think the time is right (if it will ever be right).
xemul commented 11 months ago

There's no strong reasoning as it's the part of -- how do we split unit tests from dtests that doesn't have the answer yet. We can continue debating here :)

My personal "split" is -- if the test doesn't need scylla sources to run (and compile?) it should sit in another repo. All unit/ and boost/ tests need it, so they are in correct place. Tests like cql-pytest don't need scylla sources and from my POV they are in unit tests for historical reasons, I'd move them too. Test speed doesn't matter here, there's simple_boot_shutdown test in dtest that just starts and stops scylla. It takes seconds, while tests listed in boost/suite.yaml::run_first each takes minutes (took minutes, maybe it changed recently). Of course we all like when tests run fast, but that's independent issue.

I don't buy the "it's inconvenient to use two repos" argument. I see it differently -- running dtest is nightmare nowadays, so quite a few people do it on their PRs relying on CI to do it for them. That's not nice. If we had to run some/most-off/all dtests as often as we launch test.py, we'd equipped dtest(s) with test.py already and it would be as convenient as running scylla/test.py. And the pre-CI test coverage would be increased.

Next point is about the framework to start and maintain scylla cluster. That's huge duplication of efforts. There's one in dtest (I'm not even sure it's in dtest, maybe it sits in scylla-ccm?), there another one in test.py. And there's a poor-man's version in cql-pytest/run.py that starts single node. And there's a wrapper on top of the last one in object_store that also adds the ability to send HUP to the guy. What's the point?

xemul commented 11 months ago

A personal reason for object_store test to be moved into dtest is -- it makes sense to run object_store cases against real AWS S3 bucket, not only minio, maybe a part of some longevity invocations, and I wouldn't like to add the AWS S3 configuration code to test.py and/or test/pylib/

nyh commented 11 months ago

There's no strong reasoning as it's the part of -- how do we split unit tests from dtests that doesn't have the answer yet. We can continue debating here :)

Have you seen my recent presentation on "testing by developers, for developers"? It explains why I believe we developers should be writing more tests, and doing this inside scylla.git conveniently - not in some inconvenient external repository.

My personal "split" is -- if the test doesn't need scylla sources to run (and compile?) it should sit in another repo. All unit/ and boost/ tests need it, so they are in correct place.

I don't agree with this reasoning. You are basically suggesting that only tests in C++ that need to be compiled should be in Scylla.git. This was the belief for around 5 years when Scylla started, and we only had C++ tests. And I think it was disproved by cql-pytest, test/cql, test/alternator, and the other frameworks which were introduced later and do the same thing that more than half of the C++ tests do (i.e., test CQL commands) but have many advantages over the C++ - which I tried to outline in my presentation.

Tests like cql-pytest don't need scylla sources and from my POV they are in unit tests for historical reasons, I'd move them too.

It is true that one of the nice things about test/cql-pytest and a couple of other test/ subdirectories is that they don't need the Scylla source code to run. In fact, you can even run them against some outside Scylla node or even AWS DynamoDB and Cassandra. But this doesn't change the fact that in 99% of the time, the developer who develops these tests want to run them against the latest master, and the developer of Scylla wants to run the tests against the Scylla binary he just compiled. It makes it super convenient to have them in the same repository. Another convenience of being in the same repository is the ability to send a pull-request which modifies both the code and the test - it is hard to underestimate how convenient this is. I spent the beginning of this week preparing a fix that had to modify three different repositories (scylla-jmx, scylla-tools-java and dtest), and it was hell. The point I was trying to make in my presentation was that writing tests should be fun - it should be something you want to do and you should write dozens or hundreds of tests because it's so fun - it shouldn't be a tedious chore that everyone tries to avoid.

Test speed doesn't matter here, there's simple_boot_shutdown test in dtest that just starts and stops scylla. It takes seconds

Good, so one less reason to wish it out of scylla.git ;-)

while tests listed in boost/suite.yaml::run_first each takes minutes (took minutes, maybe it changed recently). Of course we all like when tests run fast, but that's independent issue.

It's not just "we all like when tests run fast" - the problem is that if tests aren't fast, developers avoid running them. So if you're aiming for "tests for developers", they need to be fast. But as you said, this isn't relevant to your case if your test is fast anyway.

I don't buy the "it's inconvenient to use two repos" argument. I see it differently -- running dtest is nightmare nowadays, so quite a few people do it on their PRs relying on CI to do it for them. That's not nice. If we had to run some/most-off/~all~ dtests as often as we launch test.py, we'd equipped dtest(s) with test.py already and it would be as convenient as running scylla/test.py. And the pre-CI test coverage would be increased.

dtest is a very big project - around 85,000 lines of test code. That's not "huge" (our C++ util tests have a bit more lines, and the cql-pytest+alternator suites are almost as big, but still pretty big and very hard and slow to make big improvements to it. @fruch and others have made big improvements to dtest's framework, and recently I improved its startup time - so running a single test takes 10 seconds instead of 5 minutes. But it's still a nightmare to run the entire dtest suite on your laptop - it has hundreds of super-slow tests, tests which write a million partitions instead of 10 for god-knows-what reasons, and so on. It will take years to change any of this. So people don't run dtests unless they have to. They also don't develop new dtests very often. For example, looking at git blame, in all my years in Scylla, I wrote about 1100 lines of code in dtest, and more than 50,000 lines of code in scylla.git/test.

But more importantly, it truely is inconvenient to use two repos. You frequently need to send a patch which fixes a feature and its test together, and it's truely inconvenient. Backporting a feature often doesn't get its test backported. I think the better question is what do you gain by having two repos?

Next point is about the framework to start and maintain scylla cluster. That's huge duplication of efforts. There's one in dtest (I'm not even sure it's in dtest, maybe it sits in scylla-ccm?), there another one in test.py. And there's a poor-man's version in cql-pytest/run.py that starts single node. And there's a wrapper on top of the last one in object_store that also adds the ability to send HUP to the guy. What's the point?

If we had a good library for starting Scylla clusters (not ccm), we could avoid some of this duplication, but honestly - what kind of duplication are you worried about? The "poor-man's version in cql-pytest/run.py" is 450 lines of code, only 300 if you exclude comments. That's it! But more importantly, and I think people don't unsderstand this enough - the cql-pytest and alternator frameworks are not JUST about running a Scylla cluster and a test on it - you can also run the same test against a Cassandra cluster! Or AWS DynamoDB! Or an existing CQL-supporting node somewhere else (e.g., a couple of times I ran cql-pytests against Scylla Cloud to see if we forgot to backport something on the version they run). dtest isn't designed to allow that. For historic reasons, dtest supposedly could be run with Cassandra instead of Scylla, but nobody did this in years.

A personal reason for object_store test to be moved into dtest is -- it makes sense to run object_store cases against real AWS S3 bucket, not only minio, maybe a part of some longevity invocations, and I wouldn't like to add the AWS S3 onfiguration code to test.py and/or test/pylib/

I guess you have not seen test/alternator. When you run test/alternator/run, by default it indeed starts a Scylla cluster and runs the DynamoDB API tests against it. But, if you run "test/alternator/run --aws" it doesn't run Scylla at all - it connects to the real DynamoDB on AWS and runs the tests on it! There is almost zero "AWS configuration code" that needs to be there (in fact, it's easier on AWS, because it's already there and you don't need to run it). The run script definitely doesn't contain things like secret keys - this the developer needs to sent on his own in ~/.aws/credentials, and it's explained in test/alternator/README.md.

And yes, the need to add code to test.py and test/pylib is exactly why I was against the idea of these centralized monoliths in the first place. You are free to reject it as well, and put your code in your test or test directory - you don't have to add code to test/pylib.

xemul commented 11 months ago

There's no strong reasoning as it's the part of -- how do we split unit tests from dtests that doesn't have the answer yet. We can continue debating here :)

Have you seen my recent presentation on "testing by developers, for developers"? It explains why I believe we developers should be writing more tests, and doing this inside scylla.git conveniently - not in some inconvenient external repository.

Sure I did listen it, and I agree that writing tests by developers is the way to go. However, it has little to do with where tests are located. You actually supported it in your presentation and here's why.

First developer writes a test, that cuts in stone how things must work. I agree with that. Then it implements the functionality so that it passes the test. Agreed on that as well. But having the test in the same repo with the code it tests is implicitly says "OK, this is how this code works now, if it changes, don't forget to update the test too". It works in some cases, but not in all. (I elaborate on this a bit more below)

My personal "split" is -- if the test doesn't need scylla sources to run (and compile?) it should sit in another repo. All unit/ and boost/ tests need it, so they are in correct place.

I don't agree with this reasoning. You are basically suggesting that only tests in C++ that need to be compiled should be in Scylla.git.

Not quite, external tests can be in C++ too, but it's not always convenient

This was the belief for around 5 years when Scylla started, and we only had C++ tests. And I think it was disproved by cql-pytest, test/cql, test/alternator, and the other frameworks which were introduced later and do the same thing that more than half of the C++ tests do (i.e., test CQL commands) but have many advantages over the C++ - which I tried to outline in my presentation.

And I do agree with it all. However, cql-pytest tests some API that scylla provides. test/alternator does the same. So does the test/cql do. They all mustn't be tied to a specific version of scylla they test, so they can be in a separate repo. The cql-pytest can be run against C* , can it? Then it's not the essential part of Scylla. The test/alternator, as far as I understand, should pass on DynamyDB too, should it? If yes, same here -- it's a related, but separate project.

Tests like cql-pytest don't need scylla sources and from my POV they are in unit tests for historical reasons, I'd move them too.

It is true that one of the nice things about test/cql-pytest and a couple of other test/ subdirectories is that they don't need the Scylla source code to run. In fact, you can even run them against some outside Scylla node or even AWS DynamoDB and Cassandra. But this doesn't change the fact that in 99% of the time, the developer who develops these tests want to run them against the latest master, and the developer of Scylla wants to run the tests against the Scylla binary he just compiled. It makes it super convenient to have them in the same repository.

This means that these tests should have an easy-to-use set up that makes them run on $(pwd)/build/.../scylla binary using with one command. Why exactly becomes super convenient? What benefits does e.g. cql-pytest get if it "knows" that scylla is in ./build/whatever/scylla vs ../scylla/build/whatever/scylla?

Another convenience of being in the same repository is the ability to send a pull-request which modifies both the code and the test - it is hard to underestimate how convenient this is.

The need to send a coordinated fix for both -- test and code -- is bad thing by itself, because, as we told, we declare how things should work first, then we implement it (below this idea is elaborated)

I spent the beginning of this week preparing a fix that had to modify three different repositories (scylla-jmx, scylla-tools-java and dtest), and it was hell. The point I was trying to make in my presentation was that writing tests should be fun - it should be something you want to do and you should write dozens or hundreds of tests because it's so fun - it shouldn't be a tedious chore that everyone tries to avoid.

Test speed doesn't matter here, there's simple_boot_shutdown test in dtest that just starts and stops scylla. It takes seconds

Good, so one less reason to wish it out of scylla.git ;-)

while tests listed in boost/suite.yaml::run_first each takes minutes (took minutes, maybe it changed recently). Of course we all like when tests run fast, but that's independent issue.

It's not just "we all like when tests run fast" - the problem is that if tests aren't fast, developers avoid running them. So if you're aiming for "tests for developers", they need to be fast. But as you said, this isn't relevant to your case if your test is fast anyway.

Yes, test timing is not the splitting criteria

I don't buy the "it's inconvenient to use two repos" argument. I see it differently -- running dtest is nightmare nowadays, so quite a few people do it on their PRs relying on CI to do it for them. That's not nice. If we had to run some/most-off/~all~ dtests as often as we launch test.py, we'd equipped dtest(s) with test.py already and it would be as convenient as running scylla/test.py. And the pre-CI test coverage would be increased.

dtest is a very big project - around 85,000 lines of test code. That's not "huge" (our C++ util tests have a bit more lines, and the cql-pytest+alternator suites are almost as big, but still pretty big and very hard and slow to make big improvements to it. @fruch and others have made big improvements to dtest's framework, and recently I improved its startup time - so running a single test takes 10 seconds instead of 5 minutes. But it's still a nightmare to run the entire dtest suite on your laptop - it has hundreds of super-slow tests, tests which write a million partitions instead of 10 for god-knows-what reasons, and so on. It will take years to change any of this. So people don't run dtests unless they have to. They also don't develop new dtests very often. For example, looking at git blame, in all my years in Scylla, I wrote about 1100 lines of code in dtest, and more than 50,000 lines of code in scylla.git/test.

But more importantly, it truely is inconvenient to use two repos. You frequently need to send a patch which fixes a feature and its test together, and it's truely inconvenient.

I think that's one of the central points of the discussion. Let's take my "split" into consideration. If a test needs scylla sources to be built, then it sits in the same repo and fixing test and code is easy. Now let's assume there's a test that doesn't need scylla sources. What can it be? Let's take cql-pytest. What does it test? The public API scylla provides that's called CQL. What does "I need to fix core and test synchronously" means in that case? That the API gets broken. That's not great.

Where am I wrong here?

Backporting a feature often doesn't get its test backported. I think the better question is what do you gain by having two repos?

Several options:

  1. "Scylla cluster management for tests" code re-use. As I wrote we have two of them with quirks and extensions in scylla repo
  2. We very quickly will make dtest repo in a state that it is easy to use it, so in case someone needs to run a long dtest, it won't have to spend time wondering what had changed since he/she ran the test on previous pull
  3. Tests explicitly define how things work in scylla. With current approach only the fact that a PR doesn't touch test.py directory is the prove that the change is API-backward-compatible

Next point is about the framework to start and maintain scylla cluster. That's huge duplication of efforts. There's one in dtest (I'm not even sure it's in dtest, maybe it sits in scylla-ccm?), there another one in test.py. And there's a poor-man's version in cql-pytest/run.py that starts single node. And there's a wrapper on top of the last one in object_store that also adds the ability to send HUP to the guy. What's the point?

If we had a good library for starting Scylla clusters (not ccm), we could avoid some of this duplication, but honestly - what kind of duplication are you worried about? The "poor-man's version in cql-pytest/run.py" is 450 lines of code, only 300 if you exclude comments.

Recent work on object_store added some quirks to it. See, e.g. the #15565 discussion about how "convenient" the REST API in pylib/ is nowadays

That's it! But more importantly, and I think people don't unsderstand this enough - the cql-pytest and alternator frameworks are not JUST about running a Scylla cluster and a test on it - you can also run the same test against a Cassandra cluster!

I agree this is super cool but I see it as the reason to keep those tests away from scylla

Or AWS DynamoDB! Or an existing CQL-supporting node somewhere else (e.g., a couple of times I ran cql-pytests against Scylla Cloud to see if we forgot to backport something on the version they run). dtest isn't designed to allow that. For historic reasons, dtest supposedly could be run with Cassandra instead of Scylla, but nobody did this in years.

A personal reason for object_store test to be moved into dtest is -- it makes sense to run object_store cases against real AWS S3 bucket, not only minio, maybe a part of some longevity invocations, and I wouldn't like to add the AWS S3 onfiguration code to test.py and/or test/pylib/

I guess you have not seen test/alternator. When you run test/alternator/run, by default it indeed starts a Scylla cluster and runs the DynamoDB API tests against it. But, if you run "test/alternator/run --aws" it doesn't run Scylla at all - it connects to the real DynamoDB on AWS and runs the tests on it! There is almost zero "AWS configuration code" that needs to be there (in fact, it's easier on AWS, because it's already there and you don't need to run it). The run script definitely doesn't contain things like secret keys - this the developer needs to sent on his own in ~/.aws/credentials, and it's explained in test/alternator/README.md.

The object_store can do the same, and it's pretty simple to do it, but the bad thing is that e.g. object_store/conftest.py::S3_Server knows what it runs -- local minio or external S3 -- and that's not nice

And yes, the need to add code to test.py and test/pylib is exactly why I was against the idea of these centralized monoliths in the first place. You are free to reject it as well, and put your code in your test or test directory - you don't have to add code to test/pylib.

nyh commented 11 months ago

First developer writes a test, that cuts in stone how things must work. I agree with that. Then it implements the functionality so that it passes the test. Agreed on that as well. But having the test in the same repo with the code it tests is implicitly says "OK, this is how this code works now, if it changes, don't forget to update the test too". It works in some cases, but not in all. (I elaborate on this a bit more below)

Right - if the test is in the same repository as the code, if your code change breaks the test, either intentionally or unintentionally - it's the developer's responsibility to fix either the test or the code, so they work together. This is a feature, not a bug...

I don't agree with this reasoning. You are basically suggesting that only tests in C++ that need to be compiled should be in Scylla.git.

Not quite, external tests can be in C++ too, but it's not always convenient

Anything can be in an external repository. You can even Scylla Scylla into 10 different repositories - we partially did this, e.g., with Seastar - and can start splitting it further - perhaps one repository for materialized views, another for raft, and... But why? I think you nailed the reason why it should be a single repository: It's convenient to have one repository. And the nice thing about git is that it allows many different people from many different teams to work on the same repository. You don't have to split it.

And I do agree with it all. However, cql-pytest tests some API that scylla provides. test/alternator does the same. So does the test/cql do. They all mustn't be tied to a specific version of scylla they test, so they can be in a separate repo. The cql-pytest can be run against C* , can it? Then it's not the essential part of Scylla. The test/alternator, as far as I understand, should pass on DynamyDB too, should it? If yes, same here -- it's a related, but separate project.

All of this is true, and in fact I keep advocating doing all of this (running new Scylla tests on older Scylla versions, running Scylla tests on Cassandra or DynamoDB, etc.). But this is not the common thing you do. What you usually do most of the time, as a developer, is to edit the test and edit Scylla source code and run your new test on your new Scylla. You want that to be convenient. And then, you want it to be convenient to set a pull request to one - not 2 or 7 - repositories.

Every time I have a patch to Scylla which breaks a dtest, and I need to fix dtest too, it's a frustrating process usually requiring several steps (send a patch to dtest, then to Scylla, then another patch to dtest). And it's simply not needed if it's one repository.

This means that these tests should have an easy-to-use set up that makes them run on $(pwd)/build/.../scylla binary using with one command. Why exactly becomes super convenient? What benefits does e.g. cql-pytest get if it "knows" that scylla is in ./build/whatever/scylla vs ../scylla/build/whatever/scylla?

You're right that running the test is not the big difficulty. I do what you propose with dtest (I set it up once to run Scylla from ../scylla/build/dev/, or something like that, and then it's easy to run). The problem is how to the tests become separate from the code, it's hard to synchronize code and tests, hard to send pull requests, hard to backport code with its test, and so on.

Another convenience of being in the same repository is the ability to send a pull-request which modifies both the code and the test - it is hard to underestimate how convenient this is.

The need to send a coordinated fix for both -- test and code -- is bad thing by itself, because, as we told, we declare how things should work first, then we implement it (below this idea is elaborated)

In my talk, I didn't preach for "test-test methodology" - I said it was an option, and even then I emphasized the fact that even if you think you wrote all the tests up-front, in most cases while writing the code you'll discover new edge cases that you want to add tests for. When you fix existing code, things get even hairier, sometimes you discover that an old test was wrong - it "enshrined" wrong behavior. Alternatively, you may discover that the test was a too picky on error messages, and the error message changed. So you need to fix the test together with the code.

But more importantly, it truely is inconvenient to use two repos. You frequently need to send a patch which fixes a feature and its test together, and it's truely inconvenient.

I think that's one of the central points of the discussion. Let's take my "split" into consideration. If a test needs scylla sources to be built, then it sits in the same repo and fixing test and code is easy. Now let's assume there's a test that doesn't need scylla sources. What can it be? Let's take cql-pytest. What does it test? The public API scylla provides that's called CQL. What does "I need to fix core and test synchronously" means in that case? That the API gets broken. That's not great.

Where am I wrong here?

One thing you're wrong here is that often (too often, unfortunately) we "break" the API by fixing it - we correct incorrect behavior, and are no longer backward-compatible with the old API. I just fixed an example of this last week: We had "SELECT JSON t" return incorrect results when t had the type "time". We had two tests - one in test/boost and one in dtest, that enshrined the wrong behavior (both test frameworks don't run Cassandra, so never noticed the mistake). Fixing test/boost was easy, and I could do it together with my original patch. Fixing dtest was a mess.

Backporting a feature often doesn't get its test backported. I think the better question is what do you gain by having two repos?

Several options:

1. "Scylla cluster management for tests" code re-use. As I wrote we have two of them with quirks and extensions in scylla repo

Reuse what? Where? Please note that the dtest project doesn't even have its own cluster management - there is yet another separate repository that does this, "ccm". So if you move your test into dtest, you can indeed start to use ccm. And your test will become much slower and... not gain anything. So what the point?

You're right that in the past there was an idea to throw away ccm, scylla.git's scripts, and whatever QA uses in SCT - and replace all of them by one library that will work for all of them. This never actually materialized, and to be honest, I think in some cases, code reuse is overrated and brings more harm then good. Yes, I could have reused ccm in cql-pytest - and have test startup take 30 seconds. Instead I wrote my own code - 500 lines of code (much of them comments ;-)) - and cql-pytest startup take 0.3 seconds. If ccm was great, I would be happy to reuse it. But it was very very far from being great, and I didn't want to spend months or years battling to make it great (if it's at all possible, given all the crap it has) , when I could write what I really needed in just a few days.

If somebody writes a really great and lightweight CCM replacement, we could use it as a library in both scylla.git and dtest.git.

2. We very quickly will make dtest repo in a state that it _is_ easy to use it, so in case someone needs to run a long dtest, it won't have to spend time wondering what had changed since he/she ran the test on previous pull

You are welcome to try. You are even welcome to move object_store to dtest and see how it goes (the fact I'm saying here I'm not a fan of this idea, doesn't mean I want to stop you if you try!).

As I said, dtest is a big project, with a lot of historic baggage that span more than a decade, and it's not easy to make it much better. It has definitely become better over the last couple of years - thanks to big efforts by @fruch and others in QA (like converting from the horrible nosetest to pytest) and a small effort from me to speed it up - but it's still not fun to use it. Did you use dtest often? Do you develop dtests often?

3. Tests explicitly define how things work in scylla. With current approach only the fact that a PR doesn't touch test.py directory is the prove that the change is API-backward-compatible

I don't understand why this is true, or why putting tests in a separate repository changes this.

If we had a good library for starting Scylla clusters (not ccm), we could avoid some of this duplication, but honestly - what kind of duplication are you worried about? The "poor-man's version in cql-pytest/run.py" is 450 lines of code, only 300 if you exclude comments.

Recent work on object_store added some quirks to it. See, e.g. the #15565 discussion about how "convenient" the REST API in pylib/ is nowadays

That's what I love about non-monolythic code: You think pylib/ is bad? Don't use it. I don't use it in test/alternator or test/cql-pytest. Also test/cql doesn't use it (it doesn't even use code at all ;-)). And neither does test/boost. Yes, we have some duplication, like two copies of the Scylla-starting code. But to tell you the truth - I'm much less worried about code duplication than I have about test frameworks which are so inconvenient to use that nobody writes tests. This was the state of dtest for many years. It should be interesting to run "git blame" on dtest and see how many test lines each developer contributed to it - my guess (I didn't check) that the numbers will be very low. I'm also worried about being stuck with frameworks that don't have the features I need - e.g., test.py can start Scylla but it can't start Cassandra. That's a non-starter for cql-pytest (it should have also been the case for test/cql and test/topology, but that's another story).

xemul commented 11 months ago

First developer writes a test, that cuts in stone how things must work. I agree with that. Then it implements the functionality so that it passes the test. Agreed on that as well. But having the test in the same repo with the code it tests is implicitly says "OK, this is how this code works now, if it changes, don't forget to update the test too". It works in some cases, but not in all. (I elaborate on this a bit more below)

Right - if the test is in the same repository as the code, if your code change breaks the test, either intentionally or unintentionally - it's the developer's responsibility to fix either the test or the code, so they work together. This is a feature, not a bug...

I don't agree with this reasoning. You are basically suggesting that only tests in C++ that need to be compiled should be in Scylla.git.

Not quite, external tests can be in C++ too, but it's not always convenient

Anything can be in an external repository. You can even Scylla Scylla into 10 different repositories - we partially did this, e.g., with Seastar - and can start splitting it further - perhaps one repository for materialized views, another for raft, and... But why? I think you nailed the reason why it should be a single repository: It's convenient to have one repository.

Convenience is not always a good driver. Is is convenient to write single-line description to changes. It is convenient not to split changes into smaller pieces and to merge one huge patch instead. It is convenient for maintainers merge own PRs. It is convenient to merge PRs without review based purely on CI runs. But we don't do it.

And the nice thing about git is that it allows many different people from many different teams to work on the same repository. You don't have to split it.

That's two-fold. Git allows you to transparently have many repositories into one. Not as natively as "just one", of course.

And I do agree with it all. However, cql-pytest tests some API that scylla provides. test/alternator does the same. So does the test/cql do. They all mustn't be tied to a specific version of scylla they test, so they can be in a separate repo. The cql-pytest can be run against C* , can it? Then it's not the essential part of Scylla. The test/alternator, as far as I understand, should pass on DynamyDB too, should it? If yes, same here -- it's a related, but separate project.

All of this is true, and in fact I keep advocating doing all of this (running new Scylla tests on older Scylla versions, running Scylla tests on Cassandra or DynamoDB, etc.). But this is not the common thing you do. What you usually do most of the time, as a developer, is to edit the test and edit Scylla source code and run your new test on your new Scylla. You want that to be convenient. And then, you want it to be convenient to set a pull request to one - not 2 or 7 - repositories.

I agree that it's convenient.

Every time I have a patch to Scylla which breaks a dtest, and I need to fix dtest too, it's a frustrating process usually requiring several steps (send a patch to dtest, then to Scylla, then another patch to dtest). And it's simply not needed if it's one repository.

Why is that? From my perspective that's because dtest and scylla re "out of sync", so every time you "fix" dtest that fix should survive older version of scylla (without the fix) and newer version of scylla (with the fix) and then you can assume that scylla is all fixed now and the compatibility can be removed from dtest.

This means that these tests should have an easy-to-use set up that makes them run on $(pwd)/build/.../scylla binary using with one command. Why exactly becomes super convenient? What benefits does e.g. cql-pytest get if it "knows" that scylla is in ./build/whatever/scylla vs ../scylla/build/whatever/scylla?

You're right that running the test is not the big difficulty. I do what you propose with dtest (I set it up once to run Scylla from ../scylla/build/dev/, or something like that, and then it's easy to run). The problem is how to the tests become separate from the code, it's hard to synchronize code and tests, hard to send pull requests, hard to backport code with its test, and so on.

Another convenience of being in the same repository is the ability to send a pull-request which modifies both the code and the test - it is hard to underestimate how convenient this is.

The need to send a coordinated fix for both -- test and code -- is bad thing by itself, because, as we told, we declare how things should work first, then we implement it (below this idea is elaborated)

In my talk, I didn't preach for "test-test methodology" - I said it was an option, and even then I emphasized the fact that even if you think you wrote all the tests up-front, in most cases while writing the code you'll discover new edge cases that you want to add tests for. When you fix existing code, things get even hairier, sometimes you discover that an old test was wrong - it "enshrined" wrong behavior. Alternatively, you may discover that the test was a too picky on error messages, and the error message changed. So you need to fix the test together with the code.

OK, let's take the "error message changed" example. The need to fix test together with the code comes from the fact that dtest and scylla re out of sync. There are two patches -- one for scylla that fixes the error message (patch A) and the other one for dtest that changes the way this message is tested (patch B). Once either patch hits its repo tests start to fail, because pre-B tests don't know about A message and B-tests don't know about pre-A message. This can be solved the other way. For example -- add dtest as subrepo to scylla. When A comes in it carries the submodule update which includes B in the test.

But more importantly, it truely is inconvenient to use two repos. You frequently need to send a patch which fixes a feature and its test together, and it's truely inconvenient.

I think that's one of the central points of the discussion. Let's take my "split" into consideration. If a test needs scylla sources to be built, then it sits in the same repo and fixing test and code is easy. Now let's assume there's a test that doesn't need scylla sources. What can it be? Let's take cql-pytest. What does it test? The public API scylla provides that's called CQL. What does "I need to fix core and test synchronously" means in that case? That the API gets broken. That's not great. Where am I wrong here?

One thing you're wrong here is that often (too often, unfortunately) we "break" the API by fixing it - we correct incorrect behavior, and are no longer backward-compatible with the old API. I just fixed an example of this last week: We had "SELECT JSON t" return incorrect results when t had the type "time". We had two tests - one in test/boost and one in dtest, that enshrined the wrong behavior (both test frameworks don't run Cassandra, so never noticed the mistake). Fixing test/boost was easy, and I could do it together with my original patch. Fixing dtest was a mess.

OK, let's merge dtest(s) into scylla then? Why not?

Backporting a feature often doesn't get its test backported. I think the better question is what do you gain by having two repos?

Several options:

1. "Scylla cluster management for tests" code re-use. As I wrote we have two of them with quirks and extensions in scylla repo

Reuse what? Where? Please note that the dtest project doesn't even have its own cluster management - there is yet another separate repository that does this, "ccm".

I suspected that, but now you confirm. Yes, cluster management in ccm and cluster management in scylla/test/pylib/cluster.py are duplicates of each other. The former is probably worse/slower version of the latter, but still.

So if you move your test into dtest, you can indeed start to use ccm. And your test will become much slower and... not gain anything. So what the point?

You're right that in the past there was an idea to throw away ccm, scylla.git's scripts, and whatever QA uses in SCT - and replace all of them by one library that will work for all of them. This never actually materialized, and to be honest, I think in some cases, code reuse is overrated and brings more harm then good. Yes, I could have reused ccm in cql-pytest - and have test startup take 30 seconds. Instead I wrote my own code - 500 lines of code (much of them comments ;-)) - and cql-pytest startup take 0.3 seconds. If ccm was great, I would be happy to reuse it.

You're summarizing my point from the different angle. You say -- if ccm was great I would be happy to reuse it. My point is -- we're not using ccm because it's not great, instead we're writing a new poorman scylla-local ccm.

But it was very very far from being great, and I didn't want to spend months or years battling to make it great (if it's at all possible, given all the crap it has) , when I could write what I really needed in just a few days.

Yeah :(

If somebody writes a really great and lightweight CCM replacement, we could use it as a library in both scylla.git and dtest.git.

Why would anyone do it? E.g. -- I proposed to start this transition, but you're saying this makes no sense and it would be more convenient not to :D

2. We very quickly will make dtest repo in a state that it _is_ easy to use it, so in case someone needs to run a long dtest, it won't have to spend time wondering what had changed since he/she ran the test on previous pull

You are welcome to try. You are even welcome to move object_store to dtest and see how it goes (the fact I'm saying here I'm not a fan of this idea, doesn't mean I want to stop you if you try!).

As I said, dtest is a big project, with a lot of historic baggage that span more than a decade, and it's not easy to make it much better. It has definitely become better over the last couple of years - thanks to big efforts by @fruch and others in QA (like converting from the horrible nosetest to pytest) and a small effort from me to speed it up - but it's still not fun to use it. Did you use dtest often? Do you develop dtests often?

I use dtest roughly two times a month. When CI fails on my PR and I need to re-run some specific test with more logging or extra patching to fix it.

I didn't develop dtests yet. I know it's going to be painful.

3. Tests explicitly define how things work in scylla. With current approach only the fact that a PR doesn't touch test.py directory is the prove that the change is API-backward-compatible

I don't understand why this is true, or why putting tests in a separate repository changes this.

If we had a good library for starting Scylla clusters (not ccm), we could avoid some of this duplication, but honestly - what kind of duplication are you worried about? The "poor-man's version in cql-pytest/run.py" is 450 lines of code, only 300 if you exclude comments.

Recent work on object_store added some quirks to it. See, e.g. the #15565 discussion about how "convenient" the REST API in pylib/ is nowadays

That's what I love about non-monolythic code: You think pylib/ is bad? Don't use it. I don't use it in test/alternator or test/cql-pytest. Also test/cql doesn't use it (it doesn't even use code at all ;-)). And neither does test/boost. Yes, we have some duplication, like two copies of the Scylla-starting code. But to tell you the truth - I'm much less worried about code duplication than I have about test frameworks which are so inconvenient to use that nobody writes tests. This was the state of dtest for many years. It should be interesting to run "git blame" on dtest and see how many test lines each developer contributed to it - my guess (I didn't check) that the numbers will be very low. I'm also worried about being stuck with frameworks that don't have the features I need - e.g., test.py can start Scylla but it can't start Cassandra. That's a non-starter for cql-pytest (it should have also been the case for test/cql and test/topology, but that's another story).

nyh commented 11 months ago

OK, let's merge dtest(s) into scylla then? Why not?

Good question. I think the original reason for this split was that - I don't know if you realized this - that our dtest repository is actually a private repository - it's not public like scylla.git!

Another reason why this split was kept was that the developers of Scylla and dtest were mostly different groups of people (see "git blame"!) - Avi's core developers vs. Roy's test developers, and each group wanted to work by different processes. For example, believe it or not, scylla.git refused to work with pull requests (and used patches on a mailing list) until a few years ago, but dtest.git used pull request from the start. Today, scylla.git and dtest.git still have different CI, coding styles and pre-commit policies, created by different teams of people.

I suspected that, but now you confirm. Yes, cluster management in ccm and cluster management in scylla/test/pylib/cluster.py are duplicates of each other. The former is probably worse/slower version of the latter, but still.

What do you mean "but still"? If ccm sucks, why should I have used it in test/cql-pytest/run.py? Especially when I wrote a much better version (for my needs) in less than 500 lines of code and only a few days work? (by the way, I wrote it for testing Alternator initially).

Code reuse is not the only thing that matters. After creating test/{cql-pytest,alternator}/run, I wrote about 50,000 lines of tests for Alternator and CQL. What mattered to me is how easy and smooth this process - of developing tests - was. Not whether or not I reused an existing terrible library (ccm) or wrote 500 lines of code to replace it.

If somebody writes a really great and lightweight CCM replacement, we could use it as a library in both scylla.git and dtest.git.

Why would anyone do it? E.g. -- I proposed to start this transition, but you're saying this makes no sense and it would be more convenient not to :D

Maybe I did't understand what you're proposing. You said "move object_store test to dtest". I didn't realize you suggested to "modify object_store to use ccm instead of our home-brewed Scylla-starting scripts" or "improve ccm to be less of a disaster for object_store".

If your plan is to move object_store into dtest and then start working seriously on improving both dtest and ccm to make it a better framework for your test, then I wish you all the luck. I still think it will be inconvenient for you to work in a second repository, but I won't oppose your plan.

As I said, dtest is a big project, with a lot of historic baggage that span more than a decade, and it's not easy to make it much better. It has definitely become better over the last couple of years - thanks to big efforts by @fruch and others in QA (like converting from the horrible nosetest to pytest) and a small effort from me to speed it up - but it's still not fun to use it. Did you use dtest often? Do you develop dtests often?

I use dtest roughly two times a month. When CI fails on my PR and I need to re-run some specific test with more logging or extra patching to fix it.

I didn't develop dtests yet. I know it's going to be painful.

I think that before making assertions on dtest and ccm vs. other test framework, you need to have some experience with dtest and ccm. And before making decisions on whether or not it's ok or frustrating on sending patches to a separate dtest repository, you should have some experience with sending PRs to dtest.

You are welcome to try moving object_store into dtest, and experience all of this for yourself. I was probably too harsh in my comments above. I'm not against that you try this move to yourself, on the test suite that you are developing your self, and see how much you like it (or you don't). Sorry if I made it sound that way. I'm a strong believer that everyone should be allowed to follow their own favorite ways of working, and if you believe in moving your test code to dtest, please do so. I only wanted to warn you that a separate repository might not be all that you imagine it to be. But maybe you should, indeed, try it and see for yourself. Maybe you'll discover that you were right, and I was wrong.

fruch commented 11 months ago

given said all that, if you think you want to try out introducing any tests into dtest, be my guest, and I would do what I can to help out.

nyh commented 11 months ago

CCM is doing exactly what you would expected of it - people are trashing about it cause of information from 3 years ago.

That's possible. Anyway, some things I expected from CCM and weren't true when I looked at it some 4 years ago:

  1. I wanted a developer to easily use it, but needed to set a gazillion environment variables first.
  2. I wanted it to create a single Scylla node in 0.3 seconds, not 30 seconds.
  3. I wanted to be able to understand the code to improve it. CCM was over 8000 lines, and for some reason, every piece of code had several copies (one Cassandra version, one Scylla version, one "node" version and one "cluster" version, or something like that) and every time I wanted to modify something, I ended up modifyng the wrong thing and my change not having any affect :-)

I'm not saying we couldn't fix all those things. But as you know, it took years to fix that "single node startup time" in dtest/ccm (there was a lot of opposition to what was considered "hacks" to make tests faster) - and when I just did this myself from scratch in test/alternator, I had it working in 200 lines of shell-script code and 2 days of work (yes, the first version of the test/alternator/run script was a shell script, not Python, until Kostja rightly protested).

nyh commented 11 months ago
  • dtest is private for historical reasons, no one cloud told me exactly why it's not public.

I think the original thinking was that having tests public won't actually help the open-source users (users don't actually need to run tests), but can help competitors that might want to fork our code or Cassandra's code - so why make them public?

By the way, this "business" consideration is still true for scylla.git tests, and we sort of violated it with the scylla.git tests... For example, consider that I created a massive test suite for the DynamoDB API in test/alternator, which can now be run by any competitor that wants to add a DynamoDB-compatible API for their own database product! So not making it private was was a business mistake... But it was good "citizenship" in the open-source world - if we ever want to have external contributors, the tests should be public too. We had a few external contributors over the years, and they were always frustrated when their patch failed a dtest - but they couldn't even see the dtest because it's not public.

fruch commented 11 months ago

CCM is doing exactly what you would expected of it - people are trashing about it cause of information from 3 years ago.

That's possible. Anyway, some things I expected from CCM and weren't true when I looked at it some 4 years ago:

  1. I wanted a developer to easily use it, but needed to set a gazillion environment variables first.
  2. I wanted it to create a single Scylla node in 0.3 seconds, not 30 seconds.
  3. I wanted to be able to understand the code to improve it. CCM was over 8000 lines, and for some reason, every piece of code had several copies (one Cassandra version, one Scylla version, one "node" version and one "cluster" version, or something like that) and every time I wanted to modify something, I ended up modifyng the wrong thing and my change not having any affect :-)

The difference is ccm has lots of users, i.e. 2500 in dtest and lots of tests in drivers

One need to be careful when changing it, 4 years we didn't have a good way to validate a change in CCM with all of its usages. now we have better ways from dtest to run them all and see the affect.

I'm not saying we couldn't fix all those things. But as you know, it took years to fix that "single node startup time" in dtest/ccm (there was a lot of opposition to what was considered "hacks" to make tests faster) - and when I just did this myself from scratch in test/alternator, I had it working in 200 lines of shell-script code and 2 days of work (yes, the first version of the test/alternator/run script was a shell script, not Python, until Kostja rightly protested).

This is inaccurate as well, single node startup time is in dtest for few years. The issues we try the same hack for multiple nodes, and the fix for it was straight forward, and not some you have available in scylla unittest

I'm not sure what you are arguing with the 200 lines script, if you we using CCM it might be two liners, so what ? now you are missing other features we implemented in CCM, like picking up release version, for example.

In the end there are pros and cons to writing dtest vs. in repo tests. and really depends on what your tests are required todo, if it's more similar to tests already written in dtest, that easy.

Also in the end you can write in both, depends on the need.