aws / random-cut-forest-by-aws

An implementation of the Random Cut Forest data structure for sketching streaming data, with support for anomaly detection, density estimation, imputation, and more.
https://github.com/aws/random-cut-forest-by-aws
Apache License 2.0
206 stars 33 forks source link

Maven release and snapshot workflows #334

Closed amitgalitz closed 2 years ago

amitgalitz commented 2 years ago

Issue #, if available:

303

301

Description of changes:

maven-release workflow:

This workflow is executed manually by a repo maintainer when releasing an official version to maven. It will publish the stable release to a staging repo where with one click we can publish it fully to maven central. It will also create a tag and a github release. If someone accidentally runs this when version still has -SNAPSHOT, there is extra safety to stop workflow.

maven-snapshot:

releases a snapshot when commit is pushed to main and version is -SNAPSHOT.

Changed all version numbers to 3.0-rc4-SNAPSHOT. We will be changing to a better practice in what version RCF is on when developing and releasing. The new process will involve incrementing to the next version right after a release as all new development is technically a part of the next release. We are also adding a -SNAPSHOT qualifier to the end of the version to indicate that this current version is during development.

side-note: it is up to us to decide if we want to move the next development iteration to 3.0-rc4-SNAPSHOT or 3.0-SNAPSHOT and that is something I would like feedback on.

General flow example:

  1. 3.0.0 is released by manual click run on the new github maven-release workflow (more details in bottom)
  2. We submit a PR to increment to 3.1.0-SNAPSHOT (we are actively developing features for 3.1 release)
    • This can be done easily with one command mvn versions:set -DnewVersion=3.1.0-SNAPSHOT -DgenerateBackupPoms=false
  3. We are actively developing features that will be a part of 3.1 release, on every merge to main the latest snapshot can be fetched from the snapshot repository
  4. We decide that we want to release 3.1 as on official release, we now submit a PR to bump the version from 3.1.0-SNAPSHOT to 3.1.0.
    • This can be done with one easy command: mvn versions:set -DremoveSnapshot -DgenerateBackupPoms=false
  5. We manually run (1 click) the maven-release workflow, new tag is created, a new Github release created, and artifacts are uploaded to staging repository. To finalize this release we login to nexus and ensure it is closed and click release to officially push to maven central.

The last step where we release to maven, we are actually only releasing to the staging repository where we have to login and click release one more time. This adds an extra layer of security and fault tolerance, however we can easily change this in the future to skip the staging part. For now I think we should first release just to the staging repository at least for the first official release with this new workflow.

Next steps:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

sudiptoguha commented 2 years ago

Thanks for the PR. But (a) this PR should be serialized after PR 333 -- currently there is a version in Maven which is most recent and another RC4 is not needed. (b) the next planned version (after 333 is resolved) should be 3.0.0. What is schedule for testing/benchmarking to be completed ?

sean-zheng-amazon commented 2 years ago

@sudiptoguha we are definitely prioritizing the benchmarking work, and the experiments will be wrapped up by end of June. Meanwhile since this PR is more about the ci/cd set up, should we have it in first to benefit future releases?

amitgalitz commented 2 years ago

Thanks for the PR. But (a) this PR should be serialized after PR 333 -- currently there is a version in Maven which is most recent and another RC4 is not needed. (b) the next planned version (after 333 is resolved) should be 3.0.0. What is schedule for testing/benchmarking to be completed ?

I don't think the PR itself needs to wait for the testing/benchmarking work or to be after pr 333 is merged. However if you believe rc4 doesn't make since I can change the version to be 3.0-SNAPSHOT instead. 3.0-rc3-SNAPSHOT wouldn't make sense as that is supposed to be before 3.0-rc3 official release which already happened.

sudiptoguha commented 2 years ago

As I have said above, let me say it again, we will prioritize PR 333 based on the scientific mission and content of the library. The issues refered to in the PR are 90+ days old, it is unseemly to clamor for a fast action after 90+ days of delays. We will discuss this PR after benchmarks are done (and 333 is resolved) -- (a) some changes may be required (I do not think so, but there is no need to prejudge) and (b) I believe that holding a firm line is the only way the benchmarking will ever be completed up to scientific fidelity.

sudiptoguha commented 2 years ago

Given that 333 is now merged, can the version numbers be changed to 3.0.0 ? We begin the release to maven -- unless of course there are emergent issues from the benchmarking/etc. tests.

dblock commented 2 years ago

Can the 3.0 version increment be done separately by the person releasing the library, and get this merged? That version increment otherwise would be conflated with the GH workflows added here. This is just tooling. I'd want to see a separate commit that says "version now is 3.0.0".

sudiptoguha commented 2 years ago

https://github.com/aws/random-cut-forest-by-aws/pull/335

sudiptoguha commented 2 years ago

So far the assumption was that Amit was the person releasing the library, but the increment can happen independent of him.

dblock commented 2 years ago

@amitgalitz looks like the version conflicts in this PR need to be resolved, but looking forward to snapshot/build/release automation!

sean-zheng-amazon commented 2 years ago

@sudiptoguha agree that we should now bump the version to 3.0, Amit will do that when he back. Meanwhile is there any other concern/dependencies before we merge this PR to enable the ci/cd?

sudiptoguha commented 2 years ago

The version discussion was never a matter of negotiation, I am not sure how that got mixed in an automation PR in the first place.

That being said I am reading next steps as "I spent some time trying to convert the repository over to gradle since gradle makes it a lot easier to create custom tasks, a more concise build script and overall better development experience. Gradle also is known to have much faster build times. However I was having multiple different errors with aligning dependencies and having everything work. I decided this should be a different task as we want to make sure we do it right and ensure the jar made from gradle is identical to maven. Create more workflows for auto incrementing versions and adding code coverage".

That seems to be the bar. Open source should raise bars.

If there are multiple different errors with aligning dependencies of this library (which has minimal dependencies) then I foresee that exact same issue will be present in every automation script that consumes this library. @dblock, thoughts?

Why not do things right? It is of course ok to stage things -- but halfway solutions are habit of being requiring attention again.

dblock commented 2 years ago

I don't have any strong opinions of whether replacing maven with gradle is worth it. I do sometimes like that Gradle helps avoid switching between -SNAPSHOT versions in pom.xml by hand, but that's about it.

dblock commented 2 years ago

I rebased this PR for ya, https://github.com/aws/random-cut-forest-by-aws/pull/337, in case you're ready to merge it.

amitgalitz commented 2 years ago

@dblock @sudiptoguha Just read through all the comments as I was out for a little bit. Since there might be some issues showing up from benchmarking I still think there is a case to be made to making the version 3.0.0-SNAPSHOT until that's been decided but we can also just keep at 3.0.0 for now and not do the move back to -SNAPSHOT and then back to 3.0.0. Other then that I can rebase this PR and close the one you made if that is okay @dblock? In regards for gradle, I don't think Maven is a halfway solution if I made it seem like that in my description. For RCF currently it provides the benefits of handeling -SNAPSHOT to release more smoothly and faster build times (which are documented as a whole but not for this repo yet) but maven is still a good solution. I think it shouldn't be a reason to stop this PR but it could be part of a larger issue to potentially move to gradle and integrate other things like code coverage as I assume this issue is referring too (https://github.com/aws/random-cut-forest-by-aws/issues/97)

sean-zheng-amazon commented 2 years ago

+1. I honestly don't see obvious benefits converting maven to gradle for rcf, especially considering the minimal dependencies the library has (gradle is more flexible on dependency management rules for sure). There are pros and cons between them, but both are used by a majority of java applications for production. I do agree with Amit that we should merge this to enable ci/cd asap and looking into gradle conversion in parallel, but we should also be open to suggestions/concerns from repo owners. @sudiptoguha @jotok , thoughts?

sudiptoguha commented 2 years ago

"Since there might be some issues showing up from benchmarking I still think there is a case to be made to making the version 3.0.0-SNAPSHOT until that's been decided " --> You should feel free to make any case but that discussion is moot at this point. If there is a reproducible bug with the most recent code in RCF library, please file an issue as is acceptable practice in open source. Otherwise any such discussion is non-actionable.

You have every right to be concerned about potential issues of using RCF in the AD plugin and you should exercise those rights in the discussion forum of the AD plugin. I should point out that if AD plugin did end up benchmarking then maybe AD plugin would become more responsible about it's use of upstream software. I was hoping that going through a scientific measurement process would be helpful for the AD plugin. Given that the plugin is a wrapper, it begs the question what are the impacts of the wrapper on the overall end-to-end user experience (from precision/recall or resource consumption). It is worth noting that there always has been a "randomcutforest-benchmark" package in RCF. The benchmarking was never necessary for RCF.

dblock commented 2 years ago

@amitgalitz you should rebase this one, I thought you were out for a while, I closed my PR

Re: version. I am used to seeing 3.0.0-SNAPSHOT in source code, and every commit gets built and published as a snapshot, until the maintainer (@sudiptoguha) decides to release, at that time they switch it to 3.0.0, release, and increment to 3.0.1-SNAPSHOT.

@sudiptoguha Are you saying something else? Other than a rebase, what needs to happen in this PR to get merged?

sudiptoguha commented 2 years ago

answering @dblock -- I (and Josh) are ready to release RCF 3.0 to maven. That was the PR 335.

Since this project had no resourcing, it was my understanding that Amit would be the person to release to Maven.

Independently AD plugin required tooling for its build.

It seems to me that the requirement of tooling has bled into the decision mechanism of who gets to decide RCF is released.

dblock commented 2 years ago

Thanks @sudiptoguha, I understand! Let's leave tooling alone.

This PR 1) adds a workflow to publish SNAPSHOT builds to maven, and 2) adds a workflow to publish the final release to maven. I think both make maintainers life a lot easier, and since the code is ready to release I recommend rebasing this PR, merging, ensuring -SNAPSHOT works, then using the new workflow to release. This way we know it works for next time. And I think @amitgalitz doing it sounds good to me ;)

amitgalitz commented 2 years ago

Given that 333 is now merged, can the version numbers be changed to 3.0.0 ? We begin the release to maven -- unless of course there are emergent issues from the benchmarking/etc. tests

I didn't mean to make it seem like I am deciding when RCF should be released. I just wanted to make sure I address this previous comment and be transparent of what I am trying to reproduce on my end, but as I still haven't been able to reproduce the issue in a unit test it's fair to not base it off this. I agree the correct way would be to reproduce and open a Github issue on RCF not base RCF on some things I notice on AD and so far I am not able to to 100% say I have a new reproducible bug.

I'll rebase this PR now, the only conflicting messaging I am getting is that if I want to test out the -SNAPSHOT I should change the version to 3.0.0-SNAPSHOT for now, make sure that works on this PR merging and then submit a new PR changing the version to 3.0.0 for the official release (all today). If I want to test out the snapshot workflow which is less invasive first as @dblock is recommending. Is that okay or should I just go ahead with 3.0.0 as is right now?

sudiptoguha commented 2 years ago

I am not following - someone else can chime in. Isn't the following simpler:

a. Release the current version to Maven b. change to 3.0.0-SNAPSHOT, that way a snapshot refers to the most recent version in Maven. Otherwise there is no 3.0.0 to be made "snapshot" of ... the 3.0.0-SNAPSHOT can stay.

In this alternate path, if there are issues (and time required) in testing "-SNAPSHOT" then this library is not subject to a number of changes for an indefinite amount of time (testing can take time).

dblock commented 2 years ago

I am not following - someone else can chime in. Isn't the following simpler:

Sure, if you feel that way. My argument to do it now was that If there are bugs in the release workflow you won't know until the next release. It's NBD though!

a. Release the current version to Maven b. change to 3.0.0-SNAPSHOT, that way a snapshot refers to the most recent version in Maven. Otherwise there is no 3.0.0 to be made "snapshot" of ... the 3.0.0-SNAPSHOT can stay.

After you release 3.0 you should never have another 3.0 build, so it should be incremented to the next developer iteration, 3.0.1-SNAPSHOT. Otherwise it will be confusing to your users.

sudiptoguha commented 2 years ago

@dblock - makes sense. @amitgalitz (A) please release the current version to maven and (B) use 3.0.1-SNAPSHOT

There is a possibility that there are issues at next release but let's not change versions back and forth. There will be bugs in this RCF code (now or some day). They will get fixed (to the ability we possess).

amitgalitz commented 2 years ago

Okay rebasing this PR so its set to 3.0.0 and then will run release workflow and then will increment codebase to 3.0.1

amitgalitz commented 2 years ago

rebase had wrong commit for some reason, doing it again

sudiptoguha commented 2 years ago

Thanks @amitgalitz!

dblock commented 2 years ago

@amitgalitz Do you want to open a new proposal for moving the infrastructure here to Gradle with the pros/cons/benefits and getting @sudiptoguha's opinion before you/anyone else does more work on it?