Closed kgibm closed 4 months ago
UFO review comments/questions:
Add details on environments tested: bare metal, container etc. - clarify the performance characteristics across environments. If they all show the same then state that
What about volume mounted logs? Will that show different performance behaviors?
Questions about how the VM writes to disk: open/sync/close, or writes to an open file that are flushed over time (async).
Question on if we can warn users when we detect possible slow environments.
How do we inform users they need to look into verbose GC logs?
Should this be disabled for performance benchmarks when comparing other JVMs? No, should benchmark what the customer runs with by default.
Run SOE tests with the option enabled to determine the log file size impact on the build logs gathered during the test runs.
Consider updating tWAS defaults to align with the Liberty defaults being proposed in this UFO
Consider placing the GC logs under a subfolder in the logs/ directory. Something like logs/gclogs/? Need to check with some team for an opinion there (I forgot to make a note of which team to ask).
Do logs have sensitive data? Noted this will be covered later. Assume this will be discussed when the review continues to the end of the UFO document.
Note that server.env is available to set the options for environment values.
Verbose GC settings must only apply to start and run. Further consideration is needed for the checkpoint action.
Investigate looking in $JAVA_HOME/release to determine the JVM variant. This file is standard in Java 9+. It has things like JVM_VARIANT="Openj9" that could be quickly scanned to determine what options to use.
Considerations for what the native launcher on Z needs for this feature.
InstantOn support will need to consider how verbose GC options can change from checkpoint to restore
Additional communication is needed for Z with WebSphere Liberty
How to beta this? The server script (.sh and .bat) needs to be updated, and the changes will be the same for GA and beta releases.
etc/server.env that enables the option for verbose GC.
Part 2 UFO Review
Should we do this for HotSpot? It is technically possible to detect with a release file
Need to coordinate with checkpoint action until we consume EA OpenJ9 build that has fix for InstantOn
Should there be a new section in the Logs documentation for gc logs? Answer - Yes.
Discussed the fact that verbose GC logs will get put into a checkpoint image (InstantOn). Should be small though, so not concerned.
Discussed security concerns around jvmargs in verbose GC logs. Reach out to Gary Pitcher for concerns. On call we thought it was "safe", but confirmation should be done with the security team.
Discussed log name, decided not to make it unique per server.
User settings for GC log rotation take precedence.
Discussed current customers using -verbose:gc to log to the console. Now they get the good behavior of rotating GC logs, but this is a behavior change.
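The review item about scanning $JAVA_HOME/release can be illustrated with a minimal sketch. This is not the server script's implementation (that lives in the .sh and .bat launchers); it only shows the idea of reading the JVM_VARIANT key. The function names `read_jvm_variant` and `is_openj9` are hypothetical, and the case-insensitive comparison is an assumption made because the review notes the value as "Openj9".

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_jvm_variant(java_home: str) -> Optional[str]:
    """Return the JVM_VARIANT value from <java_home>/release, or None if absent."""
    release = Path(java_home) / "release"
    try:
        text = release.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return None  # no readable release file in this Java home
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "JVM_VARIANT":
            # Values in the release file are quoted, e.g. JVM_VARIANT="Openj9"
            return value.strip().strip('"')
    return None


def is_openj9(java_home: str) -> bool:
    """Case-insensitive check (assumption: exact casing of the value may vary)."""
    variant = read_jvm_variant(java_home)
    return variant is not None and variant.lower() == "openj9"


if __name__ == "__main__":
    # A throwaway directory stands in for $JAVA_HOME in this demonstration.
    with tempfile.TemporaryDirectory() as fake_home:
        Path(fake_home, "release").write_text(
            'JAVA_VERSION="17.0.9"\nJVM_VARIANT="Openj9"\n', encoding="utf-8"
        )
        print(read_jvm_variant(fake_home))
        print(is_openj9(fake_home))
```

A missing or unreadable release file is treated as "variant unknown" rather than an error, so a launcher following this approach would simply skip the OpenJ9-specific options in that case.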
@kgibm can you add a comment to indicate how the socialization feedback was addressed?
@NottyCode Sure. How each item was addressed is on slides 38-42 of the UFO; copying in:
@kgibm I'm not seeing the following updates:
@NottyCode
I'm not seeing the following updates:
* Run SOE tests with the option enabled to determine the log file size impact on the build logs gathered during the test runs. Added to System Test Impact slide
Sorry, that's on the Automated Testing slide 28 instead. I'll update the comment.
* Verbose GC settings must only apply to start and run. Further consideration is needed for the checkpoint action. Added to Feature Design slide
On slide 12: "Use SERVER_*_JAVA_OPTIONS so that it only applies to start and run actions, not all actions"
* How to use log files instead of jobs for verbose GC on Z Added to Communication slide
On slide 18: "and how to use HFS/ZFS instead with proper tokens if desired"
@OpenLiberty/demo-approvers Demo scheduled for EOI 24.04
Serviceability Approval Comment - Please answer the following questions for serviceability approval:
UFO -- does the UFO identify the most likely problems customers will see and identify how the feature will enable them to diagnose and solve those problems without resorting to raising a PMR? Have these issues been addressed in the implementation?
Test and Demo -- As part of the serviceability process we're asking feature teams to test and analyze common problem paths for serviceability and demo those problem paths to someone not involved in the development of the feature (eg. L2, test team, or another development team).
a) What problem paths were tested and demonstrated?
b) Who did you demo to?
c) Do the people you demo'd to agree that the serviceability of the demonstrated problem scenarios is sufficient to avoid PMRs for any problems customers are likely to encounter, or that L2 should be able to quickly address those problems without need to engage L3?
SVT -- SVT team is often the first team to try new features and often encounters problems setting up and using them. Note that we're not expecting SVT to do full serviceability testing -- just to sign-off on the serviceability of the problem paths they encountered. a) Who conducted SVT tests for this feature? b) Do they agree that the serviceability of the problems they encountered is sufficient to avoid PMRs, or that L2 should be able to quickly address those problems without need to engage L3?
Which L2 / L3 queues will handle PMRs for this feature? Ensure they are present in the contact reference file and in the queue contact summary, and that the respective L2/L3 teams know they are supporting it. Ask Don Bourne if you need links or more info.
Does this feature add any new metrics or emit any new JSON events? If yes, have you updated the JMX metrics reference list / Metrics reference list / JSON log events reference list in the Open Liberty docs?
@OpenLiberty/serviceability-approvers
Yes, the UFO identifies the most likely problems customers will see, as well as how to debug/solve them. The scenarios have also been tested with FAT testing.
Test and Demo -- As part of the serviceability process we're asking feature teams to test and analyze common problem paths for serviceability and demo those problem paths to someone not involved in the development of the feature (eg. L2, test team, or another development team). a) What problem paths were tested and demonstrated?
* Adding VERBOSEGC=false to server.env turns off logging.
* Having VERBOSEGC=false in server.env still allows user configuration to work.
* VERBOSEGC=true still creates the verbosegc log.
b) Who did you demo to? Jim Blye
c) Do the people you demo'd to agree that the serviceability of the demonstrated problem scenarios is sufficient to avoid PMRs for any problems customers are likely to encounter, or that L2 should be able to quickly address those problems without need to engage L3? Yes
SVT -- SVT team is often the first team to try new features and often encounters problems setting up and using them. Note that we're not expecting SVT to do full serviceability testing -- just to sign-off on the serviceability of the problem paths they encountered. a) Who conducted SVT tests for this feature? No explicit SVT was required but Brian Hanczaryk is the SVT Feature Focal Point b) Do they agree that the serviceability of the problems they encountered is sufficient to avoid PMRs, or that L2 should be able to quickly address those problems without need to engage L3? No explicit SVT was performed for this feature.
Which L2 / L3 queues will handle PMRs for this feature? Ensure they are present in the contact reference file and in the queue contact summary, and that the respective L2/L3 teams know they are supporting it. Ask Don Bourne if you need links or more info.
WAS L2: ADM WAS L3: Kernel
N/A
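The three demonstrated problem paths, together with the review decision that user GC log settings take precedence, can be summarized in a small decision sketch. It is an illustration under stated assumptions, not the launcher's code: the function names are hypothetical, JVM variant detection is left out, and the default option string (file name, 10 files, 1024 GC cycles per file) is a placeholder that uses the OpenJ9 -Xverbosegclog syntax rather than a value taken from this issue. Only a user-supplied -Xverbosegclog is treated as overriding the default, since the review notes that existing -verbose:gc users now get rotating logs.

```python
from typing import Dict, List

# Assumed placeholder for illustration only; the real default file name and
# rotation values are not stated in this issue.
ASSUMED_DEFAULT_OPTION = "-Xverbosegclog:verbosegc.#.log,10,1024"


def parse_server_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines from server.env text, skipping blanks and # comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            env[key.strip()] = value.strip()
    return env


def user_configured_gc_log(jvm_args: List[str]) -> bool:
    """True if the user already supplies their own -Xverbosegclog option."""
    return any(arg.startswith("-Xverbosegclog") for arg in jvm_args)


def default_verbose_gc_args(env: Dict[str, str], jvm_args: List[str]) -> List[str]:
    """Return the extra JVM arguments a launcher would add by default."""
    if env.get("VERBOSEGC", "true").lower() == "false":
        return []  # default logging switched off in server.env
    if user_configured_gc_log(jvm_args):
        return []  # user settings take precedence over the default
    return [ASSUMED_DEFAULT_OPTION]
```

Because the function returns only the arguments to add, a user's own -Xverbosegclog setting is untouched in every branch, which matches the second problem path: VERBOSEGC=false removes the default without disabling user configuration.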
@OpenLiberty/ste-approvers The STE Slidedeck has been uploaded to the STE Archive.
@OpenLiberty/svt-approvers - There are no SVT requirements for this feature. Please let me know approval can be granted or if anything else is needed.
WASSDK Support is good with the STE slides. Hence approving.
@OpenLiberty/performance-approvers Can you please review the Performance approval for this feature? Please let me know if approval can be granted or if anything else is needed.
@OpenLiberty/instanton-approvers Can you please review the InstantOn approval for this feature? Please let me know if approval can be granted or if anything else is needed.
The developer opened the following documentation issue: https://github.com/OpenLiberty/docs/issues/7240 The ID team has incorporated the updates. The developer has approved the updates. Approving.
This PR is merged and in a release build. Consulted with Eric and Harry....closing issue.... @rsherget @hlhoots
see above comment
Description
By default, verbosegc is not enabled in Liberty (specifically, not enabled by default in Java). This is a problem if a performance or OutOfMemoryError issue occurs as the issue will often need to be reproduced with verbosegc, or users may simply overlook garbage collection performance issues (e.g. thread dumps may point to various application stacks but the underlying issue could be garbage collection). Verbosegc was enabled by default for new profiles in WAS traditional 9.0.0.3 and 9.0.0.4 (z/OS). This epic proposes to enable verbose garbage collection by default on IBM Java/Semeru. Initially discussed in design issue #23001.
Documents
When available, add links to required feature documents. Use "N/A" to mark particular documents which are not required by the feature.
Aha: N/A
UFO: https://ibm.box.com/s/o3isyhh62xixic925g8m7qto5ufgw4nk
FTS: Link to Feature Test Summary GH Issue
Beta Blog: Link to Beta Blog Post GH Issue
GA Blog: Link to GA Blog Post GH Issue
Process Overview
Prioritization
Design
Implementation
Legal and Translation
Beta
GA
Other Deliverables
General Instructions
The process steps occur roughly in the order as presented. Process steps occasionally overlap.
Each process step has a number of tasks which must be completed or must be marked as not applicable ("N/A").
Unless otherwise indicated, the tasks are the responsibility of the Feature Owner or a Delegate of the Feature Owner.
If you need assistance, reach out to the OpenLiberty/release-architect.
Important: Labels are used to trigger particular steps and must be added as indicated.
Prioritization (Complete Before Development Starts)
The (OpenLiberty/chief-architect) and area leads are responsible for prioritizing the features and determining which features are being actively worked on.
Prioritization
[x] Feature added to the "New" column of the Open Liberty project board
[x] Priority assigned
Design (Complete Before Development Starts)
Design preliminaries determine whether a formal design, which will be provided by an Upcoming Feature Overview (UFO) document, must be created and reviewed. A formal design is required if the feature requires any of the following: UI, Serviceability, SVT, Performance testing, or non-trivial documentation/ID.
Design Preliminaries
ID Required, if non-trivial documentation needs to be created by the ID team.
ID Required - Trivial, if no design will be performed and only trivial ID updates are needed.
Design
Design Review Request
Design Approval Request
Design Approved
No Design
No Design Approval Request
No Design Approved
Product Management Approval Request and notifies OpenLiberty/product-management
Product Management Approved (OpenLiberty/product-management)
FAT Documentation
[x] "Feature Test Summary" child task created
Implementation
A feature must be prioritized before any implementation work may begin to be delivered (inaccessible/no-ship). However, a design focused approach should still be applied to features, and developers should think about the feature design prior to writing and delivering any code.
Besides being prioritized, a feature must also be socialized (or No Design Approved) before any beta code may be delivered. All new Liberty content must be inaccessible in our GA releases until it is Feature Complete by either marking it kind=noship or beta fencing it.
Code may not GA until this feature has obtained the "Design Approved" or "No Design Approved" label, along with all other tasks outlined in the GA section.
Feature Development Begins
In Progress label
Legal and Translation
In order to avoid last minute blockers and significant disruptions to the feature, the legal items need to be done as early in the feature process as possible, either in design or as early into the development as possible. Similarly, translation is to be done concurrently with development. Both MUST be completed before Beta or GA is requested.
Legal (Complete before Feature Complete Date)
Innovation (Complete 1 week before Feature Complete Date)
Translation (Complete by Feature Complete Date)
[ ] PII (Program Integrated Information) updates are merged (i.e. all English strings due for translation have been delivered), or N/A.
Beta
In order to facilitate early feedback from users, all new features and functionality should first be released as part of a beta release.
Beta Code
kind=beta, ibm:beta, ProductInfo.getBetaEdition()
target:beta and the appropriate target:YY00X-beta (where YY00X is the targeted beta version).
release:YY00X-beta (where YY00X is the first beta version that included the functionality).
Beta Blog (Complete by beta eGA)
[ ] Beta blog issue created and populated using the Open Liberty BETA blog post template.
GA
A feature is ready to GA after it is Feature Complete and has obtained all necessary Focal Point Approvals.
Feature Complete
Translation - Complete or Translation - Missing label
release branch, feature owner adds label Translation - Complete.
Translation - Missing.
Translation - Missing label is replaced with Translation - Complete.
Translation - Blocked label.
Translation - Blocked may NOT proceed to GA until the label has been replaced with either Translation - Missing or Translation - Complete.
target:ga and the appropriate target:YY00X (where YY00X is the targeted GA version).
Focal Point Approvals (Complete by Feature Complete Date)
These occur only after GA of this feature is requested (by adding a target:ga label). GA of this feature may not occur until all approvals are obtained.
All Features
focalApproved:externals
@OpenLiberty/demo-approvers Demo scheduled for EOI [Iteration Number] to this issue.
focalApproved:demo.
focalApproved:fat.
Design Approved Features
focalApproved:id.
focalApproved:instantOn.
focalApproved:performance.
focalApproved:sve.
focalApproved:ste.
focalApproved:svt.
Remove Beta Fencing (Complete by Feature Complete Date)
GA Blog (Complete by Friday after GM)
Post GM (Complete before GA)
Post GA
[ ] Remove the target:ga and target:YY00X labels, and add the appropriate release:YY00X. (OpenLiberty/release-manager)
Other Deliverables
[ ] Standalone Feature Blog Post - A blog post specifically about your feature or N/A. (Feature owner and OpenLiberty/release-architect)
[ ] OL Guides - OL Guides assessment is complete or N/A. (OpenLiberty/guide-assessment)
[ ] Dev Experience - Developer Experience & Tools work is complete or N/A. (OpenLiberty/dev-experience-assessment)