Open tellison opened 2 years ago
I am just curious why we have two different git repos for each JDK version, e.g. https://github.com/adoptium/jdk18u and https://github.com/adoptium/temurin18-binaries. I understand the latter is used purely as storage for nightly and official releases. But is there any problem if we push binaries into the former, so that we have both source code and binaries in the same repo?
Somewhat historical, as the original OpenJDK code was all based in Mercurial repositories, and we were mirroring it into GitHub to better integrate into our build/test/distribute processes. OpenJDK have moved many repos into GitHub now, but having a plain mirror independent of the binaries distribution repo is still a handy distinction. Hopefully we won't need to recreate the mirrors again now.
Here we can use the GH "Generate Release Notes" function with two tags (*ga) for official releases (maybe even for pre-release nightly builds). Surely, a list of all git commits from the source code is not as nice as querying JIRA. But if JIRA only shows "fixed version/s" at the JDK version level (e.g. 8, 17, 18), not at the CPU/PSU level, then we will have release notes with all the JIRA info even if it is a CPU.
Producing release notes from JIRA requires just a little more logic than a list of the GitHub commits or all fixed JIRAs, as shown by Aleksey's code, so yes we don't want to just pick up everything tagged by major fix version.
Aleksey's code is capable of outputting the selected issues summary in text and html format - we'd probably want to capture them in json so they can be rendered on the website or queried as release notes.
Based on https://github.com/adoptium/website-v2/pull/1029#issuecomment-1253353724, I've tried to capture a proposed flow of where release notes should be generated, and where the output should be published, and how the website will fetch the data to render.
flowchart TD
OpenJDK_Version[/OpenJDK Version/] --> job[Release Notes Jenkins Job]
job --> | Queries | CVE_data[(CVE Data)] --> job
job --> | Queries | Notes_data[(Release Notes Data)] --> job
job[Generate Release Notes] --> | Publishes | GitHub
GitHub[(GitHub Release Assets)]
Website[Adoptium Website] --> | Fetches | GitHub
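Open questions:
- Is storing the release-notes as a GitHub asset a reasonable approach?
- What format(s) do we want the raw/intermediate format to be published in?
  - Thoughts: JSON makes it easy to parse/render on the website, txt makes it human-readable, and therefore useful, as a standalone asset.
- What is the best source of the Release Notes Data?
  - OpenJDK Jira via a REST Query (Prototyped in Add Release Notes page website-v2#103)
  - The output of running Aleksey's tool.
    - The tool currently outputs release notes in only .txt and .html formats. We could extend it to output JSON if that's the intermediate format we're aiming for.
- Where can we obtain CVE information for a given OpenJDK release?
  - As we'll always be generating release notes after the upstream release has gone out, we should be able to find a public source. If not, we may need to find a way of manually supplying which CVEs are known to be fixed in a given release, and ensure it's aggregated into the release notes in a consistent format.
I think well-formed JSON is similarly human-readable as txt, so I'd +1 that for the intermediate data.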
Let me add my 2c to the open questions:
- Is storing the release-notes as a GitHub asset a reasonable approach?
Yes, I think this is a reasonable approach. Release notes are a "deliverable asset" from Adoptium alongside the binary, SBOM, signatures, etc. The temurin-build folks can help to produce that asset.
- What format(s) do we want the raw/intermediate format to be published in?
- Thoughts: JSON makes it easy to parse/render on the website, txt makes it human-readable, and therefore useful, as a standalone asset.
The raw assets should be primarily machine readable, and as a bonus readable by humans, so JSON fits the bill. These assets are designed to be consumed by multiple "clients" (not just humans), one of which is the website that converts the release notes asset into a human-readable format. Others include scripts and parsers that use release note information for analysis and other tasks.
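To illustrate the "multiple clients" point, here is a minimal sketch of a script that consumes the JSON asset and counts entries per component. The function name and the sample data are hypothetical; the `id`, `component` and `type` field names are taken from the JSON shape proposed further down this thread and may still change.

```javascript
// Hypothetical consumer of a release-notes JSON array.
// Field names (id, component, type) follow the shape proposed in this thread.
function summariseByComponent(notes) {
  const counts = {};
  for (const note of notes) {
    // Entries without component metadata are bucketed under "unknown".
    const key = note.component ?? "unknown";
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Illustrative sample only, not real release data.
const sampleNotes = [
  { id: "JDK-8294333", component: "core-libs", type: "Backport" },
  { id: "JDK-8278472", component: "client-libs", type: "Bug" },
  { id: "JDK-8274296", component: null, type: null },
];

console.log(summariseByComponent(sampleNotes));
```

The website renderer would be another client of the same data, which is the main argument for keeping the raw asset machine-readable first.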
- What is the best source of the Release Notes Data?
- OpenJDK Jira via a REST Query (Prototyped in Add Release Notes page website-v2#103)
- The output of running Aleksey's tool.
OpenJDK Jira is the definitive data source, and so we should use that. The prototype is helpful and should be used as a basis for an explicit extraction task (script) run as part of the build process for each retained build set.
- The tool currently outputs release notes in only .txt and .html formats. We could extend it to output JSON if that's the intermediate format we're aiming for.
Not necessary, just pull straight from OpenJDK Jira. Aleksey's tool is an example we can look at, but not depend upon directly.
- Where can we obtain CVE information for a given OpenJDK release?
- As we'll always be generating release notes after the upstream release has gone out, we should be able to find a public source. If not, we may need to find a way of manually supplying which CVEs are known to be fixed in a given release, and ensure it's aggregated into the release notes in a consistent format.
I think that is an open question. IIUC, CVE info requires a login to Jira, and we don't have a suitable login for an extraction tool to use. I would start by looking at the manual process temporarily while we have that discussion about the safe way to get CVE information. @jerboaa ?
https://openjdk.org/groups/vulnerability/advisories/ which get published via vulnerability-announce@openjdk.org for every critical patch update would be a starting point. Perhaps it would be worth discussing making this info more machine readable. This would probably need to get discussed within the vulnerability group.
As far as I'm aware, there is no CVE information in the OpenJDK JIRA, at least not publicly. All the security issues are kept private to Oracle.
We can propose providing the information from the vulnerability group in a machine-readable format. I doubt it will be possible for the upcoming release in two weeks though. If you have an example of what you would want to parse, that would help a lot.
I wrote a very small script in https://github.com/BethGriggs/release-notes-prototype to attempt to pull out the JSON we need. Also pushed an early example of the JSON output.
I think we need agreement on what properties/fields we need to capture. So far I have the following:
{
"id": "JDK-8278472",
"title": "Invalid value set to CANDIDATEFORM structure",
"description": "According to the Windows API reference[1], dwStyle of CANDIDATEFORM structure should be set to CFS_CANDIDATEPOS or CFS_EXCLUDE. So, CFS_POINT is wrong here.\r\n \r\nSee line 3914 in src\\java.desktop\\windows\\native\\libawt\\windows\\awt_Component.cpp [2], AwtComponent::SetCandidateWindow function:\r\n CANDIDATEFORM cf;\r\n cf.dwStyle = CFS_POINT;\r\n ImmGetCandidateWindow(hIMC, 0, &cf);\r\n\r\n[1] https://docs.microsoft.com/en-us/windows/win32/api/imm/ns-imm-candidateform\r\n[2] https://github.com/openjdk/jdk/blob/f90425a1cbbc686045c87086af586e62f05f6c49/src/java.desktop/windows/native/libawt/windows/awt_Component.cpp#L3914",
"priority": "3",
"component": "client-libs",
"subcomponent": "client-libs/java.awt:i18n",
"link": "https://bugs.openjdk.java.net/browse/JDK-8278472",
"type": "Bug"
},
Here's an example REST API call (…%20AND%20(resolution%20not%20in%20(%22Won%27t%20Fix%22%2C%20%22Duplicate%22%2C%20%22Cannot%20Reproduce%22%2C%20%22Not%20an%20Issue%22%2C%20%22Withdrawn%22))%20AND%20(labels%20not%20in%20(release-note%2C%20testbug%2C%20openjdk-na%2C%20testbug)%20OR%20labels%20is%20EMPTY)%20AND%20(summary%20!~%20%22testbug%22)%20AND%20(summary%20!~%20%22problemlist%22)%20AND%20(summary%20!~%20%22problem%20list%22)%20AND%20(summary%20!~%20%22release%20note%22)%20AND%20(issuetype%20!%3D%20CSR)%20AND%20fixVersion%3D11.0.16&maxResults=1) which demonstrates all the data we get, per issue, from the API I am using.
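Decoded, the visible part of that query is a chain of JQL filters. A minimal sketch of how such a URL could be assembled follows, under some loud assumptions: only the clauses visible in the link are reproduced (the leading part of the original query is cut off and is not guessed here, and the duplicated `testbug` label is listed once), `buildJql` and `buildSearchUrl` are hypothetical names, and the endpoint is assumed to be Jira's standard `/rest/api/2/search` resource on the host used elsewhere in this thread.

```javascript
// Reproduces only the JQL clauses visible in the truncated link.
// Any clauses that preceded them in the original query are unknown and omitted.
function buildJql(fixVersion) {
  const clauses = [
    '(resolution not in ("Won\'t Fix", "Duplicate", "Cannot Reproduce", "Not an Issue", "Withdrawn"))',
    "(labels not in (release-note, testbug, openjdk-na) OR labels is EMPTY)",
    '(summary !~ "testbug")',
    '(summary !~ "problemlist")',
    '(summary !~ "problem list")',
    '(summary !~ "release note")',
    "(issuetype != CSR)",
    `fixVersion=${fixVersion}`,
  ];
  return clauses.join(" AND ");
}

// searchEndpoint is an assumption: the host and path used by the prototype
// are not visible in the truncated link.
function buildSearchUrl(searchEndpoint, jql, maxResults) {
  return `${searchEndpoint}?jql=${encodeURIComponent(jql)}&maxResults=${maxResults}`;
}

const url = buildSearchUrl(
  "https://bugs.openjdk.java.net/rest/api/2/search",
  buildJql("11.0.16"),
  1
);
console.log(url);
```

Setting `maxResults=1` keeps the response small, which is handy when the goal is only to inspect which fields come back per issue.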
(The script is a few lines of JS currently because that's the easiest language for me to get my thoughts out. It might be able to be converted to Bash + jq - if that is considered more maintainable for this project.)
I now have two scripts in https://github.com/BethGriggs/release-notes-prototype (see documentation in that repository):
- fetchCommitList traverses the Git history between two tags, pulls out the JDK numbers and commit hashes and writes the output to a commits.json file.
- fetchReleaseNotes uses the commits.json, fetches additional info from Jira, and writes a named release-notes.json file.
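The first step can be sketched as a small parser over `git log` output. This is not the code from the prototype repository; it assumes the log was produced with something like `git log --format="%H %s" <fromTag>..<toTag>` and that commit subjects start with `<bug number>: <summary>`. The function name, the output field names and the hashes in the sample are placeholders.

```javascript
// Extracts JDK issue ids and commit hashes from "hash subject" lines.
// Assumption: subjects look like "8294333: (tz) Update Timezone Data to 2022c".
function parseCommitLog(logText) {
  const commits = [];
  for (const line of logText.split("\n")) {
    const match = /^([0-9a-f]{40}) (\d+): (.*)$/.exec(line.trim());
    if (!match) continue; // skip merges and anything without a bug number
    const [, hash, bugNumber, summary] = match;
    commits.push({ id: `JDK-${bugNumber}`, commit: hash, title: summary });
  }
  return commits;
}

// Placeholder hashes; only the subject line is taken from this thread.
const sampleLog = [
  "0123456789abcdef0123456789abcdef01234567 8294333: (tz) Update Timezone Data to 2022c",
  "89abcdef0123456789abcdef0123456789abcdef Merge",
].join("\n");

console.log(JSON.stringify(parseCommitLog(sampleLog), null, 2));
```

The resulting array is the kind of data that would be written to commits.json and then enriched from Jira in the second step.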
Release notes are currently in this form:
[
{
"id": "JDK-8294333",
"title": "(tz) Update Timezone Data to 2022c",
"priority": "3",
"component": "core-libs",
"subcomponent": "core-libs/java.time",
"link": "https://bugs.openjdk.java.net/browse/JDK-8294333",
"type": "Backport",
"backportOf": "JDK-8292579"
},
...
]
For restricted/non-public JDK issues we just include the commit message as the title and the JDK number. I have also run a number of tests/comparisons with other release note sources to gain confidence that the script output is valid.
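A minimal sketch of that fallback, assuming a hypothetical `jiraInfo` object that has already been flattened from the Jira response and is `undefined` when the issue is not publicly visible. The field names mirror the JSON above; using `null` for the fields that cannot be filled in is an assumption about how missing values are represented.

```javascript
// Merges a commit entry with optional Jira metadata.
// jiraInfo is hypothetical and undefined for restricted/non-public issues.
function toReleaseNote(commit, jiraInfo) {
  return {
    id: commit.id,
    // Fall back to the commit message when Jira gives us no title.
    title: jiraInfo?.title ?? commit.title,
    priority: jiraInfo?.priority ?? null,
    component: jiraInfo?.component ?? null,
    subcomponent: jiraInfo?.subcomponent ?? null,
    link: `https://bugs.openjdk.java.net/browse/${commit.id}`,
    type: jiraInfo?.type ?? null,
    backportOf: jiraInfo?.backportOf ?? null,
  };
}

// Illustrative inputs only: one public issue and one without Jira data.
const publicNote = toReleaseNote(
  { id: "JDK-8294333", title: "8294333: (tz) Update Timezone Data to 2022c" },
  { title: "(tz) Update Timezone Data to 2022c", priority: "3", component: "core-libs", type: "Backport" }
);
const restrictedNote = toReleaseNote(
  { id: "JDK-0000000", title: "0000000: placeholder commit message" },
  undefined
);

console.log(publicNote, restrictedNote);
```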
release-notes.json file as a GitHub asset for the release.
I'm on vacation until Thursday next week, but I have given @sxa write access to the repository while it's still on my account in case any tweaks are necessary in my absence.
A good candidate location for the release notes script is the https://github.com/adoptium/github-release-scripts/ repository. (related: https://github.com/adoptium/github-release-scripts/issues/92).
Yep that's my preferred location for this tool too.
One other possible location would be in this repository as part of the Temurin build.
Why? It's a build artifact, and other artifacts like the SBOM get produced as part of the build, so perhaps creation of the release notes should occur at this stage, for one primary platform such as x64 Linux. We do not need separate notes for separate platforms, so they can be treated the same way as source.zip, with one copy archived as part of the overall pipeline. Two other reasons to consider putting the scripts here and generating the notes at the compile stage:
An update of where this is at:
Left to do:
FILENAME parameter to the job so this can be set manually while we're still agreeing. OpenJDK-jdk-19.0.2-ga-release-notes.json is the current naming, OpenJDK19U-jdk-release-notes_19.0.2_7.json has been suggested.
JSON files for the latest release (to be considered experimental and subject to change) are in the releases at
Reading the release notes, why do some entries have null values? Peering at the commits/issues I don't see a pattern, e.g. (from OpenJDK17U-jdk-release-notes_17.0.6_10.json)
{
"id": "JDK-8274296",
"title": "8274296: Update or Problem List tests which may fail with uiScale=2 on macOS",
"priority": null,
"component": null,
"subcomponent": null,
"link": "https://bugs.openjdk.java.net/browse/JDK-8274296",
"type": null,
"backportOf": null
},
@tellison that case is now resolved by @gdams in https://github.com/BethGriggs/release-notes-prototype/commit/8fc1241375638c99fdbaeb49e96148ec2831f9e8 - the Jira query was omitting testbugs. Originally, that was intentional when we were just going to use the output of the Jira API. But now that we're treating the Git history as the source of truth, they're included in the release notes output. The aforementioned fix will mean we have the complete values from Jira now.
Security restricted JDK issues will have values of null as we cannot access the information from the Jira query.
We will need to regenerate and upload the release note assets if we want them to be updated. I've been working on a couple of other fixes that have been noticed too (so it might be worthwhile doing that too).
Got it, thanks @BethGriggs! I'll assume that gets fixed at some point.
I've got some comments on the website's rendering, but will open a website issue for that. I was going back to the raw data to see what was captured.
Follow-up to the earlier comment (https://github.com/adoptium/temurin-build/issues/3044#issuecomment-1406878251): the release notes for 18.0.2.1+1 are now up at https://github.com/adoptium/temurin18-binaries/releases/download/jdk-18.0.2.1%2B1/OpenJDK18U-jdk-release-notes_18.0.2.1_1.json (still awaiting a cache refresh on the API for it to be visible at https://adoptium.net/temurin/release-notes/?version=jdk-18.0.2.1+1 ).
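The asset URL above follows a regular pattern, so a client can derive it from the release tag. The sketch below is inferred purely from that one example; `releaseNotesAssetUrl` is a hypothetical name, only tags of the form `jdk-<version>+<build>` are handled, and the file naming was still under discussion in this thread.

```javascript
// Derives the release-notes asset URL from a tag such as "jdk-18.0.2.1+1".
// The naming pattern is inferred from the example URL in this thread.
function releaseNotesAssetUrl(tag) {
  const match = /^jdk-((\d+)(?:\.\d+)*)\+(\d+)$/.exec(tag);
  if (!match) throw new Error(`Unsupported tag format: ${tag}`);
  const [, version, major, build] = match;
  const fileName = `OpenJDK${major}U-jdk-release-notes_${version}_${build}.json`;
  // The "+" in the tag has to be percent-encoded in the download path.
  return `https://github.com/adoptium/temurin${major}-binaries/releases/download/${encodeURIComponent(tag)}/${fileName}`;
}

console.log(releaseNotesAssetUrl("jdk-18.0.2.1+1"));
```

Tags in other formats are rejected with an error instead of being guessed at.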
Temurin releases contain noteworthy changes from OpenJDK and Temurin projects. While we will capture Temurin project changes ourselves as part of ongoing development (e.g. consider tagged issues), the OpenJDK project uses JIRA for tracking such noteworthy changes.
Aleksey Shipilëv is already capturing these in his backports monitor and shares the code containing the required JIRA query to extract these correctly.
For the purpose of documenting Temurin releases fully, we should be capturing the results of that JIRA query in our releases directory (example) alongside the binaries etc. (or as part of the SBOM?) so that we know what went into that build.
Opened as a discussion, as I'm not sure that we should be pulling directly from Aleksey's site, but I also don't wish to diverge or duplicate effort in this space; so maybe keep a local fork and ensure it is structured to allow extracting only the release notes part in a suitable format. Other ideas?