scylladb / scylla-jmx

Scylla JMX proxy
GNU Affero General Public License v3.0
29 stars 53 forks source link

Support `--primary-replica-only` option from `nodetool refresh` #192

Closed asias closed 1 year ago

asias commented 1 year ago

This is a follow up of commit 658818b2d0b91d56021b883cbfb3ed9b5797e548 to add the --primary-replica-only option to refresh.

tzach commented 1 year ago

@asias what interaction is there between --primary-replica-only and --load-and-stream? are they dependent? Mutually exclusive?

Also, please update https://github.com/scylladb/scylladb/blob/master/docs/operating-scylla/nodetool-commands/refresh.rst with the --primary-replica-only option

asias commented 1 year ago

@asias what interaction is there between --primary-replica-only and --load-and-stream? are they dependent? Mutually exclusive?

Also, please update https://github.com/scylladb/scylladb/blob/master/docs/operating-scylla/nodetool-commands/refresh.rst with the --primary-replica-only option

--primary-replica-only is useful only when --load-and-stream is used.

nyh commented 1 year ago

This patch only changes the JMX - it doesn't change the actual nodetool command or the documentation, and these things have to be done. If you can't do this in this patch because these are different repositories (I think this is a mistake, but it's an old mistake) please at least address the elephant in the room in the commit message - saying that you are also submitting a patch for nodetool and the documentation in parallel (and if possible, put links between the PRs to allow the reviewers to look at all of them).

asias commented 1 year ago

It is already referenced

here:

https://github.com/scylladb/scylla-tools-java/issues/282#issuecomment-1437718161

and

https://github.com/scylladb/scylla-tools-java/pull/327#issuecomment-1437717857

nyh commented 1 year ago

It is already referenced

Thanks, it was referenced the other way around. It would be nice to also have a reference here, with text. I still don't have a clear view without going into a rabbit hole whether you have a separate patch updating the documentation as Tzach requested, or not.

asias commented 1 year ago

It is already referenced

Thanks, it was referenced the other way around. It would be nice to also have a reference here, with text. I still don't have a clear view without going into a rabbit hole whether you have a separate patch updating the documentation as Tzach requested, or not.

We should update the doc after scylla.git updates the scylla-java-tools submodule to actually include the patch to nodetool. Otherwise we might end up with doc says something nodetool does not really support.

I saw Tzach added https://github.com/scylladb/scylladb/issues/12943

asias commented 1 year ago

Ping for merge.

nyh commented 1 year ago

Ping for merge.

I'll merge it as soon as I understand the grand plan or even the motivation. The text above doesn't mention an issue or explain what you're adding or why. Right now I don't so I don't feel comfortable reviewing this patch, and since nobody else reviewed it, I don't feel comfortable merging it.

To explain to me the grand plan, please write some text like: "The plan is to add ... to 'nodetool refresh' because [issue #xyz / Cassandra compatibility / whatever]. This patch begins the process by adding ... to JMX. After this patch is committed I plan to [write a patch / merge an already existing patch / whatever] for adding ... to nodetool and then I plan to [...] to update the documentation for this new feature".

Alternatively, if there's another reviewer more familiar with this context and doesn't need this text to understand what's going on, this reviewer can approve this PR and then I'll merge it without my own review. Everyone should remember that not only committers can (or should) do reviews.

asias commented 1 year ago

We need the jmx change in this PR to make nodetool refresh to support a new option --primary-replica-only (in this PR: https://github.com/scylladb/scylla-tools-java/pull/327).

asias commented 1 year ago

The new option is scylla only. Cassandra does not have the load and stream feature.

nyh commented 1 year ago

The new option is scylla only. Cassandra does not have the load and stream feature.

Please take a few minutes and write a whole paragraph answering all my questions, not one small question at a time. Either that, or find someone else familiar with this code to review this code and then I'll merge it.

nyh commented 1 year ago

The new option is scylla only. Cassandra does not have the load and stream feature.

Please take a few minutes and write a whole paragraph answering all my questions, not one small question at a time. Either that, or find someone else familiar with this code to review this code and then I'll merge it.

And @asias please don't assume I know everything you know when I consider that for merging. You said "The new option is scylla only. Cassandra does not have the load and stream feature.". But... wait... what? This commit seems to add a new option isPrimaryReplicaOnly, in addition to isLoadAndStream. It seems orthogonal, but now I understand it's not. Or is it?

Please write at least a paragraph what this feature is, WHY you are doing it (did someone ask for it? Is it for an issue? something??). As it stands I can't review this patch - even if the code is trivial.

asias commented 1 year ago

The new option is scylla only. Cassandra does not have the load and stream feature.

Please take a few minutes and write a whole paragraph answering all my questions, not one small question at a time. Either that, or find someone else familiar with this code to review this code and then I'll merge it.

And @asias please don't assume I know everything you know when I consider that for merging. You said "The new option is scylla only. Cassandra does not have the load and stream feature.". But... wait... what? This commit seems to add a new option isPrimaryReplicaOnly, in addition to isLoadAndStream. It seems orthogonal, but now I understand it's not. Or is it?

Please write at least a paragraph what this feature is, WHY you are doing it (did someone ask for it? Is it for an issue? something??). As it stands I can't review this patch - even if the code is trivial.

The feature and the motivation is explained here:

https://github.com/scylladb/scylla-tools-java/pull/327#issue-1592621993

asias commented 1 year ago

Who is supposed to own the scylla-jmx and scylla-tools-java subsystems?

mykaul commented 1 year ago

Who is supposed to own the scylla-jmx and scylla-tools-java subsystems?

Unsure, @avelanarius ?

denesb commented 1 year ago

I think scylla-tools-java.git and scylla-jmx.git doesn't have their own maintainers, instead scylla.git maintainers merge PRs in these repositories too. This creates a problem as scylla.git maintainers are usually familiar with neither java nor the codebases in these repositories. So it is quite hard to get PRs merged here.

avikivity commented 1 year ago

I think scylla-tools-java.git and scylla-jmx.git doesn't have their own maintainers, instead scylla.git maintainers merge PRs in these repositories too. This creates a problem as scylla.git maintainers are usually familiar with neither java nor the codebases in these repositories. So it is quite hard to get PRs merged here.

They can ask for help from language experts. But usually (as in this case) language expertise is not needed.

asias commented 1 year ago

Who is supposed to own the scylla-jmx and scylla-tools-java subsystems?

Unsure, @avelanarius ?

In the past, I suggested the scylla manager team to own the nodetool tools, since they are all management tools. It did not happen though. It is a pity that we do not even have a team to officially own a tool that is used by all the users.

nyh commented 1 year ago

I think scylla-tools-java.git and scylla-jmx.git doesn't have their own maintainers, instead scylla.git maintainers merge PRs in these repositories too. This creates a problem as scylla.git maintainers are usually familiar with neither java nor the codebases in these repositories. So it is quite hard to get PRs merged here.

I have no problem with either Java, or the code base. I just have a problem with random PRs that change something in a new feature I'm not familiar with, without giving me any context or motivation for this change. It's even worse when the change is spread out over 3 different repositories, sending me on a wild-goose-chase of links between different repositories, and 100-comment issues where the 70th comment explains a part of what I need to know.

The solution is simple: Write good commit messages. The commit messages should be self-contained. Explain the feature that is being fixed in more than one word. Give links for extra information, but not the basic information about the feature or the patch. Explain the motivation of this change, don't assume I can guess the motivation. Explain everything that needs to be explained in the commit message - don't send me to read obscure comments in other issues. And if the change involves three different github repositories, explain in each commit message exactly how this change fits with the rest. Everything I wrote here is true in general for all our repositories, but is doubly true for scylla-jmx et al. which rarely get patches so the reviewers need context even more than usual.

Finally, another request from everyone reading this: Please always remember that everybody can review, not just maintainers. If you developed this feature with someone else, or your manager asked you to develop this feature, please get this person (who knows your motivation even if you don't explain it well) to review your patch. As a maintainer when I see a patch I don't fully understand but that seems straightforward enough and it has been reviewed by someone else knowledgeable in this project, I tend to commit this patch without making a big fuss. I only make a fuss when I'm asked to be the only reviewer of the patch - and don't understand it.

P.S. Adding a reference to an issue (Refs, or better yet Fixes) is also a great way (perhaps the best way) to add some context to a PR.

asias commented 1 year ago

@nyh The original commit message reference the previous commit which does the similar thing like in this PR.

This is a follow up of commit https://github.com/scylladb/scylla-jmx/commit/658818b2d0b91d56021b883cbfb3ed9b5797e548 to add the --primary-replica-only option to refresh.

It links to both scylla feature and scylla-tools-java. Do you read it?

commit 658818b2d0b91d56021b883cbfb3ed9b5797e548
Author: Juliusz Stasiewicz <juliusz.stasiewicz@scylladb.com>
Date:   Mon Aug 30 17:54:05 2021 +0200

    Support `--load-and-stream` option from `nodetool refresh`

    This information is translated to {"load_and_stream", "true"} entry in the
    POST request to Scylla's HTTP API at `storage_service/sstables/{keyspace}`
    endpoint.

    More about this feature: scylladb/scylla#7846

    This change is a consequence of scylladb/scylla-tools-java#253.

I am not sure why you think no context or motivation is given.

In addition to to origin commit message.

I also referenced the relevant scylla-tools-java issue and PR.

https://github.com/scylladb/scylla-tools-java/issues/282#issuecomment-1437718161 https://github.com/scylladb/scylla-tools-java/pull/327#issuecomment-1437717857 .

For this specific change, only two repos are involved, scylla-tools-java.git and scylla-jmx.git.

It would be much easier for you if you look at the scylla-tools-java.git part of this change. It explains how this option is being used.

This information is forwarded through JMX to Scylla, adding an extra {"primary_replica_only", "true"} entry in the POST request to Scylla's HTTP API at storage_service/sstables/{keyspace} endpoint.

More about this feature: https://github.com/scylladb/scylladb/pull/7846

This is a follow up of commit https://github.com/scylladb/scylla-tools-java/commit/3b378f7095e3cd8b776fd188a68234431dfe4d70.

$ nodetool refresh -las -pro ks2 standard1
Loading new SSTables for keyspace=ks2, table=standard1, load_and_stream=true, primary_replica_only=true

$ nodetool refresh -las ks2 standard1
Loading new SSTables for keyspace=ks2, table=standard1, load_and_stream=true, primary_replica_only=false

$ nodetool refresh ks2 standard1
Loading new SSTables for keyspace=ks2, table=standard1, load_and_stream=false, primary_replica_only=false

Fixes https://github.com/scylladb/scylla-tools-java/issues/282

Let's forget about how you got confused previously for a moment. If you still got confused now, let's have a call to solve this out and proceed.

tzach commented 1 year ago

@asias @nyh ping