Closed tjwatson closed 3 months ago
Feedback from UFO meeting, part 1:
Slide 9
Slide 17
org.crac is for all applications to use and that io.openliberty.checkpoint.spi is for the runtime to use.
Slide 22
Slide 23
checkpointRestore()
checkpointRestore() method. Can always add later if demand arises.
checkpointRestore() method
Slide 28
Slide 29
Slide 36
Slide 38
@OpenLiberty/demo-approvers Demo scheduled for EOI [23.17]
not sure what I did to close this so reopening
OL:
Serviceability Approval Comment - Please answer the following questions for serviceability approval:
UFO -- does the UFO identify the most likely problems customers will see and identify how the feature will enable them to diagnose and solve those problems without resorting to raising a PMR? Have these issues been addressed in the implementation? Yes, there are multiple slides in the updated UFO that identify information. Reviewed by the Open Liberty Kernel team.
Test and Demo -- As part of the serviceability process we're asking feature teams to test and analyze common problem paths for serviceability and demo those problem paths to someone not involved in the development of the feature (eg. IBM Support, test team, or another development team).
a) What problem paths were tested and demonstrated? All common error paths.
b) Who did you demo to? Open Liberty Kernel Team
c) Do the people you demo'd to agree that the serviceability of the demonstrated problem scenarios is sufficient to avoid PMRs for any problems customers are likely to encounter, or that IBM Support should be able to quickly address those problems without need to engage SMEs? Yes, the Open Liberty Kernel team believes the problem scenarios are sufficient to avoid PMRs.
SVT -- SVT team is often the first team to try new features and often encounters problems setting up and using them. Note that we're not expecting SVT to do full serviceability testing -- just to sign-off on the serviceability of the problem paths they encountered. a) Who conducted SVT tests for this feature? b) Do they agree that the serviceability of the problems they encountered is sufficient to avoid PMRs, or that IBM Support should be able to quickly address those problems without need to engage SMEs?
Which IBM Support / SME queues will handle PMRs for this feature? Ensure they are present in the contact reference file and in the queue contact summary, and that the respective IBM Support/SME teams know they are supporting it. Ask Don Bourne if you need links or more info.
Does this feature add any new metrics or emit any new JSON events? If yes, have you updated the JMX metrics reference list / Metrics reference list / JSON log events reference list in the Open Liberty docs?
SVT tested checkpoint/restore using the Spring PetClinic sample app with the Open Liberty daily beta image stg.icr.io/cp/olc/open-liberty-daily:beta (Open Liberty 24.0.0.7-beta/wlp-1.0.90.cl240620240517-1201) on Eclipse OpenJ9 VM, version 21.0.2+13-LTS (en_US). Restore was done on an Amazon EKS cluster.
OL:
Serviceability Approval Comment - Please answer the following questions for serviceability approval:
- UFO -- does the UFO identify the most likely problems customers will see and identify how the feature will enable them to diagnose and solve those problems without resorting to raising a PMR? Have these issues been addressed in the implementation?
Yes, the UFO Serviceability section identifies the likely causes of failures a customer may see when using the feature.
- Test and Demo -- As part of the serviceability process we're asking feature teams to test and analyze common problem paths for serviceability and demo those problem paths to someone not involved in the development of the feature (eg. IBM Support, test team, or another development team). a) What problem paths were tested and demonstrated?
All failures identified in the UFO are demonstrated.
b) Who did you demo to?
To the kernel team
c) Do the people you demo'd to agree that the serviceability of the demonstrated problem scenarios is sufficient to avoid PMRs for any problems customers are likely to encounter, or that IBM Support should be able to quickly address those problems without need to engage SMEs?
Yes, the Open Liberty Kernel team believes the problem scenarios are sufficient to avoid PMRs.
SVT -- SVT team is often the first team to try new features and often encounters problems setting up and using them. Note that we're not expecting SVT to do full serviceability testing -- just to sign-off on the serviceability of the problem paths they encountered.
a) Who conducted SVT tests for this feature?
@tam512
b) Do they agree that the serviceability of the problems they encountered is sufficient to avoid PMRs, or that IBM Support should be able to quickly address those problems without need to engage SMEs?
In SVT, we looked at serviceability aspects such as error messages, and they are clear and helpful.
TBD
- Which IBM Support / SME queues will handle PMRs for this feature? Ensure they are present in the contact reference file and in the queue contact summary, and that the respective IBM Support/SME teams know they are supporting it. Ask Don Bourne if you need links or more info.
The Equinox OSGi squad will provide support. This squad already provides support for InstantOn and also support for the springBoot feature (shared with kernel team).
- Does this feature add any new metrics or emit any new JSON events? If yes, have you updated the JMX metrics reference list / Metrics reference list / JSON log events reference list in the Open Liberty docs?
No
@tam512 , can you please provide your comment for 3b on the serviceability approval:
b) Do they agree that the serviceability of the problems they encountered is sufficient to avoid PMRs, or that IBM Support should be able to quickly address those problems without need to engage SMEs?
For serviceability, the Open Liberty Kernel Team reviewed that content today, and added comments in @donbourne 's comment above.
Adding a new feature to InstantOn list, extended description for new feature CRAC 1.4. Ready, and will display the autogen. Doc issue #7331. Approving feature.
Hello @tngiang73, @gnadell. Can we acquire STE approval on Monday (or sooner) with agreement that Tom will provide the STE deck and coordinate a training session? Tom should be available to consult regarding his progress on the training materials and plans to meet with Support developers. -Regards
FYI @tjwatson: @tngiang73 will provide STE focal approval now under the agreement that we deliver the STE materials by next Monday, 10 June 2024. -Thanks all.
This is done
Description
SpringBoot 3 uses Spring Framework 6.x. Spring Framework version 6.1 (to be released November 2023) is enabling integration with CRaC. See https://docs.spring.io/spring-framework/reference/6.1-SNAPSHOT/integration/checkpoint-restore.html
This is also evident in the current milestone of 6.1 (6.1.0-M1) and can be seen in the spring source code at https://github.com/spring-projects/spring-framework/blob/654dee8cd6fd09314289e9bba92719d57001c539/spring-context/src/main/java/org/springframework/context/support/DefaultLifecycleProcessor.java#L485-L555
This will enable SpringBoot 3 applications to take advantage of checkpoint/restore technologies to start up rapidly. Liberty can implement the org.crac APIs on top of the Liberty InstantOn support as a separate feature that provides the third-party org.crac APIs to applications (here the Spring libraries themselves).
This is important because Liberty InstantOn provides an ideal solution to running checkpoint/restore applications in the cloud. Spring's support for CRaC APIs will enable their very large community of developers to easily use Checkpoint/Restore technologies. I expect the Spring Framework will continue to improve its support for CRaC such that it will make it safe to checkpoint Spring Boot applications for production restores.
Liberty InstantOn should be able to provide a seamless experience to allow SpringBoot 3 applications to safely use InstantOn once the Spring Framework supports CRaC APIs when they are present.
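To illustrate the kind of lifecycle callback the org.crac APIs give an application, here is a minimal sketch. It is not Liberty or Spring code: the Resource and Context types are local stand-ins that only mirror the shape of org.crac.Resource and org.crac.Context (the real Context is generic), and ConnectionHolder and simulateCheckpointRestore() are hypothetical names used so the example runs without the org.crac library or a real checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: local stand-ins that mirror the shape of the org.crac API.
// With the real library these types would come from the org.crac package.
class CracLifecycleSketch {

    // Stand-in for org.crac.Resource: two callbacks around checkpoint/restore.
    interface Resource {
        void beforeCheckpoint(Context context) throws Exception;
        void afterRestore(Context context) throws Exception;
    }

    // Hypothetical in-memory stand-in for org.crac.Context.
    static class Context {
        private final List<Resource> resources = new ArrayList<>();

        void register(Resource resource) {
            resources.add(resource);
        }

        // Assumption: simulates a checkpoint immediately followed by a restore
        // by invoking the callbacks; no process image is actually taken.
        void simulateCheckpointRestore() {
            try {
                for (Resource r : resources) {
                    r.beforeCheckpoint(this);
                }
                for (Resource r : resources) {
                    r.afterRestore(this);
                }
            } catch (Exception e) {
                throw new IllegalStateException("checkpoint/restore callback failed", e);
            }
        }
    }

    // Hypothetical resource: releases a connection before the checkpoint
    // and re-establishes it after the restore.
    static class ConnectionHolder implements Resource {
        private boolean open = true;
        final List<String> events = new ArrayList<>();

        boolean isOpen() {
            return open;
        }

        @Override
        public void beforeCheckpoint(Context context) {
            open = false; // close anything that must not be captured in the image
            events.add("beforeCheckpoint");
        }

        @Override
        public void afterRestore(Context context) {
            open = true; // reopen in the restored process
            events.add("afterRestore");
        }
    }

    public static void main(String[] args) {
        Context context = new Context();
        ConnectionHolder holder = new ConnectionHolder();
        context.register(holder);
        context.simulateCheckpointRestore();
        System.out.println(holder.events + " open=" + holder.isOpen());
    }
}
```

In a real application the resource would be registered with the global context provided by org.crac, and the runtime (here, a Liberty feature built on InstantOn) would drive the callbacks instead of the simulated method above.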
Additional context
See https://github.com/sdeleuze/spring-boot-crac-demo for a working example of Spring Boot with CRaC. See https://aboullaite.me/what-the-crac/ See https://github.com/CRaC/org.crac
Documents
When available, add links to required feature documents. Use "N/A" to mark particular documents which are not required by the feature.
Aha: Externally raised RFE (Aha)
UFO: InstantOn - Spring Boot 3.0
FTS: Link to Feature Test Summary GH Issue
Beta Blog: Link to Beta Blog Post GH Issue
GA Blog: Link to GA Blog Post GH Issue
Process Overview
Prioritization
Design
Implementation
Legal and Translation
Beta
GA
Other Deliverables
General Instructions
The process steps occur roughly in the order as presented. Process steps occasionally overlap.
Each process step has a number of tasks which must be completed or must be marked as not applicable ("N/A").
Unless otherwise indicated, the tasks are the responsibility of the Feature Owner or a Delegate of the Feature Owner.
If you need assistance, reach out to the OpenLiberty/release-architect.
Important: Labels are used to trigger particular steps and must be added as indicated.
Prioritization (Complete Before Development Starts)
The (OpenLiberty/chief-architect) and area leads are responsible for prioritizing the features and determining which features are being actively worked on.
Prioritization
[x] Feature added to the "New" column of the Open Liberty project board
[x] Priority assigned
Design (Complete Before Development Starts)
Design preliminaries determine whether a formal design, which will be provided by an Upcoming Feature Overview (UFO) document, must be created and reviewed. A formal design is required if the feature requires any of the following: UI, Serviceability, SVT, Performance testing, or non-trivial documentation/ID.
Design Preliminaries
ID Required, if non-trivial documentation needs to be created by the ID team.
ID Required - Trivial, if no design will be performed and only trivial ID updates are needed.
Design
Design Review Request
Design Approval Request
Design Approved
No Design
No Design Approval Request
No Design Approved
Product Management Approval Request and notifies OpenLiberty/product-management
Product Management Approved (OpenLiberty/product-management)
FAT Documentation
[x] "Feature Test Summary" child task created
Implementation
A feature must be prioritized before any implementation work may begin to be delivered (inaccessible/no-ship). However, a design focused approach should still be applied to features, and developers should think about the feature design prior to writing and delivering any code.
Besides being prioritized, a feature must also be socialized (or No Design Approved) before any beta code may be delivered. All new Liberty content must be inaccessible in our GA releases until it is Feature Complete by either marking it kind=noship or beta fencing it.
Code may not GA until this feature has obtained the "Design Approved" or "No Design Approved" label, along with all other tasks outlined in the GA section.
Feature Development Begins
In Progress label
Legal and Translation
In order to avoid last minute blockers and significant disruptions to the feature, the legal items need to be done as early in the feature process as possible, either in design or as early into the development as possible. Similarly, translation is to be done concurrently with development. Both MUST be completed before Beta or GA is requested.
Legal (Complete before Feature Complete Date)
Translation (Complete 1 week before Feature Complete Date)
Innovation (Complete 1 week before Feature Complete Date)
[x] Consider whether any aspects of the feature may be patentable. If any identified, disclosures have been submitted.
Beta
In order to facilitate early feedback from users, all new features and functionality should first be released as part of a beta release.
Beta Code
kind=beta, ibm:beta, ProductInfo.getBetaEdition()
target:beta and the appropriate target:YY00X-beta (where YY00X is the targeted beta version).
release:YY00X-beta (where YY00X is the first beta version that included the functionality).
Beta Blog (Complete 1.5 weeks before beta eGA)
[x] Beta blog issue created and populated using the Open Liberty BETA blog post template.
GA
A feature is ready to GA after it is Feature Complete and has obtained all necessary Focal Point Approvals.
Feature Complete
target:ga and the appropriate target:YY00X (where YY00X is the targeted GA version).
Focal Point Approvals (Complete by Feature Complete Date)
These occur only after GA of this feature is requested (by adding a target:ga label). GA of this feature may not occur until all approvals are obtained.
All Features
focalApproved:externals
@OpenLiberty/demo-approvers Demo scheduled for EOI [Iteration Number] to this issue.
focalApproved:demo.
focalApproved:fat.
focalApproved:globalization.
Design Approved Features
focalApproved:id.
focalApproved:performance.
focalApproved:sve.
focalApproved:ste.
focalApproved:svt.
Remove Beta Fencing (Complete by Feature Complete Date)
GA Blog (Complete by Feature Complete Date)
Post GA
[ ] Replace target:YY00X label with the appropriate release:YY00X. (OpenLiberty/release-manager)
Other Deliverables
[ ] Standalone Feature Blog Post A blog post specifically about your feature or N/A. (OpenLiberty/release-architect)
[ ] OL Guides OL Guides assessment is complete or N/A. (OpenLiberty/guide-assessment)
[ ] Dev Experience Developer Experience & Tools work is complete or N/A. (OpenLiberty/dev-experience-assessment)