adoptium / temurin-build

Eclipse Temurin™ build scripts - common across all releases/versions
Apache License 2.0

Discuss capturing release note worthy OpenJDK issues during a release #3044

Open tellison opened 2 years ago

tellison commented 2 years ago

Temurin releases contain noteworthy changes from OpenJDK and Temurin projects. While we will capture Temurin project changes ourselves as part of ongoing development (e.g. consider tagged issues), the OpenJDK project uses JIRA for tracking such noteworthy changes.

Aleksey Shipilëv is already capturing these in his backports monitor and shares the code containing the required JIRA query to extract these correctly.

For the purpose of documenting Temurin releases fully, we should capture the results of that JIRA query in our releases directory (example) alongside the binaries etc. (or as part of the SBOM?), so that we know what went into that build.

Opened as a discussion because I'm not sure we should be pulling directly from Aleksey's site, but I also don't wish to diverge or duplicate effort in this space. Maybe we keep a local fork and ensure it is structured so that only the release notes part can be extracted in a suitable format. Other ideas?

zdtsw commented 2 years ago

I am just curious why we have two different git repos for each JDK version, e.g. https://github.com/adoptium/jdk18u and https://github.com/adoptium/temurin18-binaries. I understand the latter is used purely as storage for nightly and official releases. But is there any problem if we push binaries into the former, so that we have both source code and binaries in the same repo? Then we could use the GH "Generate Release Notes" function with two tags (*ga) for an official release (maybe even for pre-release nightly builds). Surely, a list of all git commits from the source code is not as nice as querying JIRA. But if JIRA only shows "fixed version/s" at the JDK version level (e.g. 8, 17, 18), not at the CPU/PSU level, then we will have release notes with all the JIRA info even when it is a CPU.

tellison commented 2 years ago

I am just curious why we have two different git repos for each JDK version, e.g. https://github.com/adoptium/jdk18u and https://github.com/adoptium/temurin18-binaries. I understand the latter is used purely as storage for nightly and official releases. But is there any problem if we push binaries into the former, so that we have both source code and binaries in the same repo?

Somewhat historical, as the original OpenJDK code was all based in Mercurial repositories, and we were mirroring it into GitHub to better integrate into our build/test/distribute processes. OpenJDK have moved many repos into GitHub now, but having a plain mirror independent of the binaries distribution repo is still a handy distinction. Hopefully we won't need to recreate the mirrors again now.

Then we could use the GH "Generate Release Notes" function with two tags (*ga) for an official release (maybe even for pre-release nightly builds). Surely, a list of all git commits from the source code is not as nice as querying JIRA. But if JIRA only shows "fixed version/s" at the JDK version level (e.g. 8, 17, 18), not at the CPU/PSU level, then we will have release notes with all the JIRA info even when it is a CPU.

Producing release notes from JIRA requires just a little more logic than a list of the GitHub commits or all fixed JIRAs, as shown by Aleksey's code, so yes we don't want to just pick up everything tagged by major fix version.

Aleksey's code is capable of outputting the selected issues summary in text and html format - we'd probably want to capture them in json so they can be rendered on the website or queried as release notes.

BethGriggs commented 1 year ago

Based on https://github.com/adoptium/website-v2/pull/1029#issuecomment-1253353724, I've tried to capture a proposed flow of where release notes should be generated, and where the output should be published, and how the website will fetch the data to render.

flowchart TD
    OpenJDK_Version[/OpenJDK Version/] --> job[Release Notes Jenkins Job]
    job --> | Queries | CVE_data[(CVE Data)] --> job
    job --> | Queries | Notes_data[(Release Notes Data)] --> job
    job[Generate Release Notes] --> | Publishes | GitHub    
    GitHub[(GitHub Release Assets)]
    Website[Adoptium Website] --> | Fetches | GitHub

Open questions:

jiekang commented 1 year ago

I think well-formed JSON is about as human-readable as txt, so I'd +1 that for the intermediate data.

tellison commented 1 year ago

Let me add my 2c to the open questions:

  • Is storing the release-notes as a GitHub asset a reasonable approach?

Yes, I think this is a reasonable approach. Release notes are a "deliverable asset" from Adoptium alongside the binary, SBOM, signatures, etc. The temurin-build folks can help to produce that asset.

  • What format(s) do we want the raw/intermediate format to be published in?
    • Thoughts: JSON makes it easy to parse/render on the website; txt makes it human-readable, and therefore useful as a standalone asset.

The raw assets should be primarily machine readable, and as a bonus readable by humans - so json fits the bill. These assets are designed to be consumed by multiple "clients" (not just humans), one of which is the website that converts the release notes asset into a human-readable format. Others include scripts and parsers that use release note information for analysis and other tasks.
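
As a rough illustration of a non-website "client", here is a minimal sketch of a script that summarises a release notes asset. It assumes the asset is a JSON array of objects that each carry a `component` field; that layout, the function name, and the placeholder issue IDs are assumptions for illustration, not an agreed format.

```python
import json
from collections import Counter


def summarise_by_component(release_notes_json: str) -> Counter:
    """Count release-note entries per component.

    Assumes a JSON array of objects with a "component" key (hypothetical
    layout). Entries whose component is missing or null are skipped.
    """
    entries = json.loads(release_notes_json)
    return Counter(e["component"] for e in entries if e.get("component"))


# Placeholder data: these IDs are made up for the example.
sample = json.dumps([
    {"id": "JDK-0000001", "component": "client-libs", "type": "Bug"},
    {"id": "JDK-0000002", "component": "hotspot", "type": "Bug"},
    {"id": "JDK-0000003", "component": "client-libs", "type": "Enhancement"},
    {"id": "JDK-0000004", "component": None, "type": None},
])

for component, count in summarise_by_component(sample).most_common():
    print(f"{component}: {count}")
```

Because the asset is plain JSON, the same few lines work for any analysis script, which is the main argument for making the raw format machine-readable first.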

OpenJDK Jira is the definitive data source, and so we should use that. The prototype is helpful and should be used as a basis for an explicit extraction task (script) run as part of the build process for each retained build set.

  • The tool currently outputs release notes in only .txt and .html formats. We could extend it to output JSON if that's the intermediate format we're aiming for.

Not necessary, just pull straight from OpenJDK Jira. Aleksey's tool is an example we can look at, but not depend upon directly.

  • Where can we obtain CVE information for a given OpenJDK release?

    • As we'll always be generating release notes after the upstream release has gone out, we should be able to find a public source. If not, we may need to find a way of manually supplying which CVEs are known to be fixed in a given release, and ensure it's aggregated into the release notes in a consistent format.

I think that is an open question. IIUC, CVE info requires a login to Jira, and we don't have a suitable login for an extraction tool to use. I would start by looking at the manual process temporarily while we have that discussion about the safe way to get CVE information. @jerboaa ?

jerboaa commented 1 year ago

  • Where can we obtain CVE information for a given OpenJDK release?

    • As we'll always be generating release notes after the upstream release has gone out, we should be able to find a public source. If not, we may need to find a way of manually supplying which CVEs are known to be fixed in a given release, and ensure it's aggregated into the release notes in a consistent format.

I think that is an open question. IIUC, CVE info requires a login to Jira, and we don't have a suitable login for an extraction tool to use. I would start by looking at the manual process temporarily while we have that discussion about the safe way to get CVE information. @jerboaa ?

https://openjdk.org/groups/vulnerability/advisories/ which get published via vulnerability-announce@openjdk.org for every critical patch update would be a starting point. Perhaps it would be worth discussing making this info more machine readable. This would probably need to get discussed within the vulnerability group.

gnu-andrew commented 1 year ago

As far as I'm aware, there is no CVE information in the OpenJDK JIRA, at least not publicly. All the security issues are kept private to Oracle.

We can propose providing the information from the vulnerability group in a machine-readable format. I doubt it will be possible for the upcoming release in two weeks though. If you have an example of what you would want to parse, that would help a lot.

BethGriggs commented 1 year ago

I wrote a very small script in https://github.com/BethGriggs/release-notes-prototype to attempt to pull out the JSON we need. Also pushed an early example of the JSON output.

I think we need agreement on what properties/fields we need to capture. So far I have the following:

 {
    "id": "JDK-8278472",
    "title": "Invalid value set to CANDIDATEFORM structure",
    "description": "According to the Windows API reference[1], dwStyle of CANDIDATEFORM structure should be set to CFS_CANDIDATEPOS or CFS_EXCLUDE. So, CFS_POINT is wrong here.\r\n  \r\nSee line 3914 in src\\java.desktop\\windows\\native\\libawt\\windows\\awt_Component.cpp [2], AwtComponent::SetCandidateWindow function:\r\n        CANDIDATEFORM cf;\r\n        cf.dwStyle = CFS_POINT;\r\n        ImmGetCandidateWindow(hIMC, 0, &cf);\r\n\r\n[1] https://docs.microsoft.com/en-us/windows/win32/api/imm/ns-imm-candidateform\r\n[2] https://github.com/openjdk/jdk/blob/f90425a1cbbc686045c87086af586e62f05f6c49/src/java.desktop/windows/native/libawt/windows/awt_Component.cpp#L3914",
    "priority": "3",
    "component": "client-libs",
    "subcomponent": "client-libs/java.awt:i18n",
    "link": "https://bugs.openjdk.java.net/browse/JDK-8278472",
    "type": "Bug"
  },

Here's an example REST API call which demonstrates all the data we get, per issue, from the API I am using. The surviving part of its query string decodes to:

… AND (resolution not in ("Won't Fix", "Duplicate", "Cannot Reproduce", "Not an Issue", "Withdrawn")) AND (labels not in (release-note, testbug, openjdk-na, testbug) OR labels is EMPTY) AND (summary !~ "testbug") AND (summary !~ "problemlist") AND (summary !~ "problem list") AND (summary !~ "release note") AND (issuetype != CSR) AND fixVersion=11.0.16&maxResults=1

(The script is a few lines of JS currently because that's the easiest language for me to get my thoughts out. It could probably be converted to Bash + jq if that is considered more maintainable for this project.)
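
For anyone wanting to reproduce a query like this, the following is a minimal sketch of how the search URL could be assembled. It only rebuilds the clauses that are visible in the call above; whatever precedes the first `AND` in the original link is not guessed. The endpoint constant is an assumption (the standard Jira REST search path on the host used in the issue links), and the function names are made up for this example.

```python
from urllib.parse import quote, urlencode

# Assumption: standard Jira REST search path on the OpenJDK bug tracker host.
SEARCH_ENDPOINT = "https://bugs.openjdk.java.net/rest/api/2/search"

EXCLUDED_RESOLUTIONS = ["Won't Fix", "Duplicate", "Cannot Reproduce",
                        "Not an Issue", "Withdrawn"]
EXCLUDED_LABELS = ["release-note", "testbug", "openjdk-na"]
EXCLUDED_SUMMARY_TERMS = ["testbug", "problemlist", "problem list", "release note"]


def build_jql(fix_version: str) -> str:
    """Rebuild only the JQL clauses that are visible in the example call."""
    resolutions = ", ".join(f'"{r}"' for r in EXCLUDED_RESOLUTIONS)
    labels = ", ".join(EXCLUDED_LABELS)
    clauses = [
        f"(resolution not in ({resolutions}))",
        f"(labels not in ({labels}) OR labels is EMPTY)",
        *[f'(summary !~ "{term}")' for term in EXCLUDED_SUMMARY_TERMS],
        "(issuetype != CSR)",
        f"fixVersion={fix_version}",
    ]
    return " AND ".join(clauses)


def build_search_url(fix_version: str, max_results: int = 1) -> str:
    """Percent-encode the JQL and append it to the (assumed) search endpoint."""
    query = urlencode(
        {"jql": build_jql(fix_version), "maxResults": max_results},
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{SEARCH_ENDPOINT}?{query}"


print(build_search_url("11.0.16"))
```

Keeping the exclusion lists as data makes it easier to review what is filtered out, for example the test bugs and problem-list changes.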

BethGriggs commented 1 year ago

Update

I now have two scripts in https://github.com/BethGriggs/release-notes-prototype (see documentation in that repository)

For restricted/non-public JDK issues we just include the commit message as the title and the JDK number. I have also run a number of tests/comparisons with other release note sources to gain confidence that the script output is valid.

Next steps

I'm on vacation until Thursday next week, but I have given @sxa write access to the repository while it's still on my account in case any tweaks are necessary in my absence.

smlambert commented 1 year ago

A good candidate location for the release notes script is the https://github.com/adoptium/github-release-scripts/ repository. (related: https://github.com/adoptium/github-release-scripts/issues/92).

sxa commented 1 year ago

A good candidate location for the release notes script is the https://github.com/adoptium/github-release-scripts/ repository. (related: adoptium/github-release-scripts#92).

Yep that's my preferred location for this tool too.

smlambert commented 1 year ago

One other possible location would be in this repository as part of the Temurin build.

Why? It's a build artifact, and other artifacts like the SBOM get produced as part of the build, so perhaps the release notes should be created at this stage, for one primary platform like x64 Linux. We do not need separate notes for separate platforms, so they can be treated the same way as source.zip, with one copy archived as part of the overall pipeline. 2 other reasons to consider putting the scripts here and generating the notes at the compile stage:

BethGriggs commented 1 year ago

An update of where this is at:

Left to do:

sxa commented 1 year ago

JSON files for the latest release (To be considered experimental and subject to change) are in the releases at

tellison commented 1 year ago

Reading the release notes, why do some entries have null values? Peering at the commits/issues I don't see a pattern, e.g. (from OpenJDK17U-jdk-release-notes_17.0.6_10.json)

  {
    "id": "JDK-8274296",
    "title": "8274296: Update or Problem List tests which may fail with uiScale=2 on macOS",
    "priority": null,
    "component": null,
    "subcomponent": null,
    "link": "https://bugs.openjdk.java.net/browse/JDK-8274296",
    "type": null,
    "backportOf": null
  },

BethGriggs commented 1 year ago

@tellison that case is now resolved by @gdams in https://github.com/BethGriggs/release-notes-prototype/commit/8fc1241375638c99fdbaeb49e96148ec2831f9e8 - the Jira query was omitting testbugs. Originally, that was intentional when we were just going to use the output of the Jira API. But now that we're treating the Git history as the source of truth, they're included in the release notes output. The aforementioned fix means we now get the complete values from Jira.

Security restricted JDK issues will have values of null as we cannot access the information from the Jira query.
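
To make the null behaviour concrete, here is a minimal sketch of combining the Git history with whatever Jira returns. It is not the prototype's actual code: the function name, the shape of the `jira_issues` lookup, and the commit-subject pattern are assumptions, and the field names are taken from the JSON example quoted above.

```python
import re
from typing import Optional

# Assumption: OpenJDK commit subjects look like "<bug number>: <summary>".
SUBJECT_RE = re.compile(r"^(\d+): ")


def build_entry(commit_subject: str, jira_issues: dict) -> Optional[dict]:
    """Build one release-note entry for a commit.

    `jira_issues` is a hypothetical lookup of issue id -> fields already
    fetched from Jira. Issues that Jira did not return (for example
    security-restricted ones) keep the commit subject as the title and get
    None, which serialises to JSON null, for everything else.
    """
    match = SUBJECT_RE.match(commit_subject)
    if match is None:
        return None  # not a "<bug number>: <summary>" commit
    issue_id = f"JDK-{match.group(1)}"
    details = jira_issues.get(issue_id, {})
    return {
        "id": issue_id,
        "title": details.get("title", commit_subject),
        "priority": details.get("priority"),
        "component": details.get("component"),
        "subcomponent": details.get("subcomponent"),
        "link": f"https://bugs.openjdk.java.net/browse/{issue_id}",
        "type": details.get("type"),
        "backportOf": details.get("backportOf"),
    }


subject = "8274296: Update or Problem List tests which may fail with uiScale=2 on macOS"
print(build_entry(subject, jira_issues={}))
```

With an empty Jira lookup this yields the same shape as the entry quoted earlier in the thread. Once the query stops omitting an issue, the same call picks up the full values.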

We will need to regenerate and upload the release note assets if we want them to be updated. I've been working on a couple of other fixes that have been noticed too (so it might be worthwhile doing that too).

tellison commented 1 year ago

Got it, thanks @BethGriggs! I'll assume that gets fixed at some point.

I've got some comments on the website's rendering, but will open a website issue for that. I was going back to the raw data to see what was captured.

sxa commented 1 year ago

Follow-up to my earlier comment https://github.com/adoptium/temurin-build/issues/3044#issuecomment-1406878251: the release notes for 18.0.2.1+1 are now up at https://github.com/adoptium/temurin18-binaries/releases/download/jdk-18.0.2.1%2B1/OpenJDK18U-jdk-release-notes_18.0.2.1_1.json (still awaiting a cache refresh on the API for it to be visible at https://adoptium.net/temurin/release-notes/?version=jdk-18.0.2.1+1).
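
For scripts that want to fetch these assets, here is a minimal sketch of deriving the download URL from a release tag. The naming pattern is inferred only from the two file names that appear in this thread (17.0.6+10 and 18.0.2.1+1), so treat it as an assumption; the function name is hypothetical, and tags in other styles (such as JDK 8's) are deliberately rejected.

```python
import re
from urllib.parse import quote

# Matches tags such as "jdk-18.0.2.1+1": full version, major version, build number.
TAG_RE = re.compile(r"^jdk-((\d+)(?:\.\d+)*)\+(\d+)$")


def release_notes_url(tag: str) -> str:
    """Guess the release-notes asset URL for a tag (inferred naming pattern)."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unsupported tag format: {tag!r}")
    version, major, build = match.groups()
    filename = f"OpenJDK{major}U-jdk-release-notes_{version}_{build}.json"
    return (
        f"https://github.com/adoptium/temurin{major}-binaries/releases/download/"
        f"{quote(tag, safe='')}/{filename}"  # '+' must be sent as %2B
    )


print(release_notes_url("jdk-18.0.2.1+1"))
```

The tag is percent-encoded because a literal `+` in a URL path is easy to mangle. The link above uses `%2B` for the same reason.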