opentracing-contrib / java-specialagent

Automatic instrumentation for 3rd-party libraries in Java applications with OpenTracing.
Apache License 2.0

document release procedure #565

Closed by andrewhsu 4 years ago

andrewhsu commented 4 years ago

Following on from #538, I was looking through https://github.com/opentracing-contrib/java-specialagent/blob/master/.circleci/config.yml and could not find how the release process is executed in order to verify this. I would like to have some documentation that walks through the steps and explains how to recover from failure if the release process bonks out halfway through.

safris commented 4 years ago

@andrewhsu, the release process is dead simple: SpecialAgent is released to Maven Central, and relies on Sonatype's nexus-staging-maven-plugin and Maven's maven-deploy-plugin to publish builds to the repos. If the version string in the <version> tag ends in "SNAPSHOT", then maven-deploy-plugin deploys to the Maven Central Snapshot Repo. If the version does not end in "SNAPSHOT", then it goes to the Maven Central Release Repo.
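
The SNAPSHOT rule above can be sketched as a small shell check. This is only an illustration: the function name `target_repo` and the version strings are hypothetical, and the actual routing is done by the Maven plugins, not by a script like this.

```shell
#!/bin/sh
# Hypothetical helper (not part of the SpecialAgent build): report which
# repository a version string would be routed to under the rule above.
target_repo() {
  case "$1" in
    *SNAPSHOT) echo "snapshot" ;;  # version ends in "SNAPSHOT"
    *)         echo "release"  ;;  # anything else is a permanent release
  esac
}

# Illustrative version strings only.
target_repo "1.7.0-SNAPSHOT"   # prints: snapshot
target_repo "1.7.0"            # prints: release
```

A check like this could be run before triggering a deploy, as a reminder that a non-SNAPSHOT version cannot be changed once it is released.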

Builds that get deployed to Maven Central Snapshot Repo can be managed (updated or deleted) via Sonatype's dashboard at https://oss.sonatype.org/.

Builds that get deployed to Maven Central Release Repo cannot be changed once they are released.

The failure recovery process has several layers:

  1. Failure in the "install" phase of the CI/CD process should never happen, because any such problem should have been caught during development.
  2. Failure in the "test" phase of the CI/CD process can happen for a couple of reasons:
    1. There is a problem with an integration test. In this case, one should verify whether the integration test passes or fails on a local machine. If it fails, then there's a bug. If it passes, then there's something wonky with the CI/CD process, and all that can really be done is to reattempt the build via the CI/CD control panel.
    2. There is a problem with the CI/CD platform spinning up the container. This can happen if the CI/CD platform fails due to its own bugs. When SpecialAgent was on Travis CI, such situations would happen almost every time. But even now that we're on CircleCI, there's an open issue regarding such a situation in #564.
  3. Failure in the "deploy" phase of the CI/CD process can happen for a few reasons:
    1. The "deploy" phase relies on binaries compiled and packaged in the "install" phase. Therefore, the "deploy" phase is solely for the deployment of these binaries to the appropriate Maven Repo, either Snapshot or Release. There should be no logical issues with regard to this build architecture, as it has worked this way since day 1.
    2. For some unknown and bizarre reason, I have found that "multi module" releases to Maven Central only work with Maven 3.5.2. I have found this to be true not just for SpecialAgent, but for other projects as well. The .circleci/config.yml file already takes care of this, but it's important to note how critical it is that the build uses Maven 3.5.2 -- otherwise, debugging the resulting issues could lead to a significant loss of time.
    3. Despite all precautions and surety, it still may happen that the "deploy" phase will fail. This can happen due to the CI/CD container crashing, or due to Sonatype OSS Nexus (the server listening for and accepting deployments) being offline or not functioning properly. Both of these situations have happened before, and each results in the "deploy" phase failing. There are 3 follow-up scenarios when the "deploy" phase fails:
      1. The failure happens before binaries are pushed to Sonatype OSS Nexus. In this case, just restart this phase of the build from the CI/CD control panel.
      2. The failure happens after all binaries are pushed to Sonatype OSS Nexus (this happens when Maven does not receive a "success" callback that all binaries have been successfully staged and deployed). In this case, everything has most likely deployed correctly. A simple check is to go to the location of the SpecialAgent's binaries in Maven Central, and ensure that the new version is there.
      3. The failure happens in the middle of the binaries being pushed to Sonatype OSS Nexus. This has happened once, and resulted in a long and painful recovery process. The issue with such a situation is that deployments to Maven Central Release Repo are permanent. Therefore, you cannot just "try again". There are 2 ways to fix this:
        1. Figure out which modules were successfully published at the location of the SpecialAgent's binaries in Maven Central, and figure out which modules are missing. Then, alter the module manifest in the pom.xml to only list the modules that should be published. Note, however, that it's much more complicated than it sounds, because omitting modules from this list will also skip them from being built, and will definitely result in build failure during the "install" and/or "test" phases. Therefore, some custom "profile" must be created that identifies only the modules that still need to be deployed (omitting the modules that were already deployed). This is a complex task to achieve, and involves "scalpel-like" modifications to the pom.xml file(s). If a module that's already been deployed is re-deployed, then this will fail the entire build and the rest of the module deployments.
        2. Alternatively, just increment the patch version number in the version string, and run the build and deployment from the beginning. This means that there will remain a build of SpecialAgent that is partially deployed and non-functional.
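
The two recovery options for a partial deploy can be sketched in shell. This is a rough illustration under stated assumptions: the function names `missing_modules` and `next_patch_version` are hypothetical, the list of already-published modules is assumed to have been collected by hand from Maven Central into a text file (one module per line), and the version is assumed to be a plain MAJOR.MINOR.PATCH string with no qualifier.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the SpecialAgent build.

# missing_modules EXPECTED_FILE PUBLISHED_FILE
# Prints the modules listed in EXPECTED_FILE that are absent from
# PUBLISHED_FILE (one module name per line in each file).
missing_modules() {
  mm_expected=$(mktemp)
  mm_published=$(mktemp)
  sort -u "$1" > "$mm_expected"
  sort -u "$2" > "$mm_published"
  # comm -23 keeps only the lines unique to the first (expected) file.
  comm -23 "$mm_expected" "$mm_published"
  rm -f "$mm_expected" "$mm_published"
}

# next_patch_version VERSION
# Increments the last dot-separated component: "1.7.0" -> "1.7.1".
# Assumes VERSION is MAJOR.MINOR.PATCH with a plain decimal patch number.
next_patch_version() {
  npv_prefix=${1%.*}   # everything before the last dot
  npv_patch=${1##*.}   # everything after the last dot
  echo "$npv_prefix.$((npv_patch + 1))"
}

# Illustrative usage with a few module names and a made-up version.
expected=$(mktemp)
published=$(mktemp)
printf '%s\n' opentracing-specialagent-util opentracing-adapter \
  opentracing-specialagent-api > "$expected"
printf '%s\n' opentracing-specialagent-util \
  opentracing-specialagent-api > "$published"
missing_modules "$expected" "$published"   # prints: opentracing-adapter
next_patch_version "1.7.0"                 # prints: 1.7.1
rm -f "$expected" "$published"
```

The first helper only identifies what is missing. It does not solve the harder problem described above of building everything while deploying only a subset.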

Some further notes:

  1. The build compiles the following modules to jdk1.7 class version:
    1. opentracing-specialagent-util
    2. opentracing-adapter
    3. opentracing-specialagent-api
    4. specialagent-fingerprint
    5. specialagent-maven-plugin
    6. opentracing-specialagent

      The rule/* modules, however, are compiled with various class versions. This is because some 3rd-party libs inherently require higher class versions, so instrumentation integrations can only be made for these higher class versions. This results in a packaging of SpecialAgent that contains a mixed bag of integrations compiled for different class versions. This is fine, however, because for the 3rd-party libs requiring higher class versions to function, one would be running the application with the appropriate JVM version for the 3rd-party lib anyway -- e.g. the akka-http integration requires jdk8, but one would not be running Akka on jdk7 for there to be a problem.

      The test/* modules are also a mixed bag of class versions. This is not an issue, however, as these modules are only relevant during the build process.

      To see which rule/* or test/* module is compiled to the jdk8 class version, execute the following:

      grep '<maven.compiler.target>1.8</maven.compiler.target>' rule/*/pom.xml
      grep '<maven.compiler.target>1.8</maven.compiler.target>' test/*/pom.xml
  2. The "install" and "test" phases execute on the jdk8 and jdk11 platforms. This is to assert logical parity of compilation, unit and integration test results for pre-jdk9 and post-jdk9 class versions.
  3. The "deploy" phase relies on the jdk8 compiler, and deploys binaries that are built to a jdk1.7 class version.
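
To double-check the class version of a compiled binary, one option is to read the major version directly from a .class file. This is a minimal sketch: the function name `class_major_version` is hypothetical, and it relies only on the standard class-file layout (4-byte magic number, 2-byte minor version, 2-byte big-endian major version), where 51 corresponds to jdk1.7, 52 to jdk8, and 55 to jdk11.

```shell
#!/bin/sh
# Hypothetical helper, not part of the SpecialAgent build.
# class_major_version FILE
# Prints the class-file major version of a compiled .class file.
class_major_version() {
  # Skip 6 bytes (magic number + minor version), then read the next
  # 2 bytes as unsigned decimal values: the big-endian major version.
  set -- $(od -An -j6 -N2 -tu1 "$1")
  echo $(( $1 * 256 + $2 ))
}

# Illustrative usage with a hand-made 8-byte header (CAFEBABE, minor 0,
# major 51) standing in for a real class file.
sample=$(mktemp)
printf '\312\376\272\276\000\000\000\063' > "$sample"
class_major_version "$sample"   # prints: 51
rm -f "$sample"
```

For a real class, `javap -v` on the class also reports a "major version" line, which may be more convenient when a JDK is at hand.
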
andrewhsu commented 4 years ago

@safris thanks for that!