Reaper endpoints: Async Repair Endpoint

Miles-Garnsey commented 1 year ago

This PR modifies the existing repair method in NodeOpsProvider to accept arguments suitable for Reaper. When complete, it should also create a new v2 endpoint to accept new repairs and return a repairID which can later be queried for status updates.

Fixes issue https://github.com/riptano/mission-control/issues/506

github-actions[bot] commented 1 year ago

No linked issues found. Please add the corresponding issues in the pull request description.
Use GitHub automation to close the issue when a PR is merged

Miles-Garnsey commented 1 year ago

@emerkle826 this is draft still, but I have a few questions and thought we might best collaborate with some concrete code to look at.

Questions:

Can the management API RPC mechanism handle polymporphic functions? We might need one which accepts a RingRange and one which accepts beginToken and endToken arguments, since you can repair non-contiguous ranges via the former. I know we've encountered issues with this in the past with the OpenAPI generator, so wanted to see what the best way to tackle it was.
Can you pass custom types into the RPC methods? The current methods seem to use only simple native Java types. Do they need to be registered as UDTs to make this work, or can you use Java types directly?
If we can't use polymorphism here to adapt to which signature we want to call, would it be reasonable to pass nulls instead?

The only place this matters is in the definition of the ranges, since the existing function just takes a start and end token, while the new function needs to take a list of start and end tokens (which is realised as a RingRange, but we could potentially use a List<Tuple> or List<List<BigInteger>> in the worst case).

emerkle826 commented 1 year ago

Can the management API RPC mechanism handle polymporphic functions? We might need one which accepts a RingRange and one which accepts beginToken and endToken arguments, since you can repair non-contiguous ranges via the former. I know we've encountered issues with this in the past with the OpenAPI generator, so wanted to see what the best way to tackle it was.

The RPC calls don't handle polymorphism (see here). RPC Methods have to have a unique RPC name, so we can't even register the same method with 2 different RPC names to handle a polymorphic situation.

emerkle826 commented 1 year ago

2. Can you pass custom types into the RPC methods? The current methods seem to use only simple native Java types. Do they need to be registered as UDTs to make this work, or can you use Java types directly?

I think we can, but I'm not sure. To do so may require enhancements to some of the RPC serialization code.

emerkle826 commented 1 year ago

3. If we can't use polymorphism here to adapt to which signature we want to call, would it be reasonable to pass nulls instead?

Yes, but I think in this case it might be better to have the API side of things determine which Range type is being requested and then build a specific RPC call for a RingRange or a start/end token. Unless I'm missing something, it seems like we will have 2 RPC methods on the Agent side, and the API/server side will handle the request and determine which of the RPC methods to call.

Miles-Garnsey commented 1 year ago

Got confused about where the beginToken/endToken vs RingRange thing was happening. It was in Reaper, not mgmt API. So we can stick with having a single repair method. That solves one problem.

I've revamped the existing RPC method, and I have the old Resource calling it now.

Miles-Garnsey commented 1 year ago

@emerkle826 can you give me a hand with this when you're in? I'm getting the below statement about illegally formatted files from CI:

Error:  To fix formatting errors, run "mvn com.coveo:fmt-maven-plugin:format"
Error:  Non complying file: /home/runner/work/management-api-for-apache-cassandra/management-api-for-apache-cassandra/management-api-agent-common/src/main/java/com/datastax/mgmtapi/rpc/models/RingRange.java
Error:  Non complying file: /home/runner/work/management-api-for-apache-cassandra/management-api-for-apache-cassandra/management-api-agent-common/src/main/java/com/datastax/mgmtapi/NodeOpsProvider.java
Error:  Failed to execute goal com.coveo:fmt-maven-plugin:2.9:check (default) on project datastax-mgmtapi-agent-common: Found 2 non-complying files, failing build -> [Help 1]
Error:

But when I run the suggested command it doesn't appear to change any files, or come up with any further information about what is wrong with the formatting. Do you know what I'm supposed to do with this?

emerkle826 commented 1 year ago

@emerkle826 can you give me a hand with this when you're in? I'm getting the below statement about illegally formatted files from CI:
Error:  To fix formatting errors, run "mvn com.coveo:fmt-maven-plugin:format"
Error:  Non complying file: /home/runner/work/management-api-for-apache-cassandra/management-api-for-apache-cassandra/management-api-agent-common/src/main/java/com/datastax/mgmtapi/rpc/models/RingRange.java
Error:  Non complying file: /home/runner/work/management-api-for-apache-cassandra/management-api-for-apache-cassandra/management-api-agent-common/src/main/java/com/datastax/mgmtapi/NodeOpsProvider.java
Error:  Failed to execute goal com.coveo:fmt-maven-plugin:2.9:check (default) on project datastax-mgmtapi-agent-common: Found 2 non-complying files, failing build -> [Help 1]
Error:  
But when I run the suggested command it doesn't appear to change any files, or come up with any further information about what is wrong with the formatting. Do you know what I'm supposed to do with this?

Just to close the loop here, you can run mvn fmt:format from the root of the project and it should auto format all of the source to the Google Java format standard. I have this documented way down in the README here. It covers Java source, license checks and XML formatting as well. Also, you will want to enable the DSE profiles too, or the formatting will skip those projects (i.e. -P dse,dse7, trunk).

Miles-Garnsey commented 1 year ago

@olim7t can Erik and I get some feedback from you on the topic of Optionals? The context is that in the NodeOpsProvider, we want to avoid adding keys to the repair options map under several circumstances:

When the option is passed in as null.
When the option is an empty collection.
When the option is the zero value of the type.

This is because we want the NodeOpsProvider logic to service v0-v3 endpoints, and some options are unavailable in the v0/v1 endpoints. We want to avoid adding entries with null values, empty collections, etc. into the options map in this case since we don't know how Cassandra will interpret them.

To facilitate this, I have the new options being passed as Optional<> to the NodeOpsProvider right now, and in the resources layers we will munge the data to retain existing logic and pass empty Optionals when the endpoint doesn't support those options.

But Erik has rightly noted that the calls over the RPC channel aren't type checked, so wrapping things in Optional<> may not be hugely valuable. He would prefer to keep null checks inside the NodeOpsProvider and consolidate emptiness into null values in the resources layer.

Neither of us have a strong feeling on this - do you have any thoughts?

olim7t commented 1 year ago

We want to avoid adding entries with null values, empty collections, etc. into the options map in this case since we don't know how Cassandra will interpret them.

I haven't checked every implementation, but looking at 3.x's RepairOption.parse, it looks pretty safe. It's just looking up known options in the map, if there are a few extra unknown ones, they should be ignored.

Miles-Garnsey commented 1 year ago

I haven't checked every implementation, but looking at 3.x's RepairOption.parse, it looks pretty safe. It's just looking up known options in the map, if there are a few extra unknown ones, they should be ignored.

It isn't so much about unknown options, it is more about setting the option values to null or empty collections. @olim7t

Miles-Garnsey commented 1 year ago

This fixed #405

k8ssandra / management-api-for-apache-cassandra

Reaper endpoints: Async Repair Endpoint #358