nerc-project / operations

Issues related to the operation of the NERC OpenShift environment

Create roadmap for planning RHOAI updates - Draft 1 #547

Closed dystewart closed 6 days ago

dystewart commented 2 months ago

Closing and replacing with this issue.

Seeing as we are well behind the most recent releases of OpenShift AI, we will want to plan out which versions we want to stop at to test new features. Here is the release history in the GitHub repo for starters:

We are currently at version 1.33.0 in production and test clusters.

A tentative roadmap:

Prereqs [April 25 - May 3] Resolution of the following issues:

Phase1 [April 25th - May 3rd]: determine the RHOAI and RHEL versions to install

Our RHOAI install plan is shaping up like this: v1.33.0 -> v2.1.0 -> v2.4.0
(Our OLM wants to update directly to 2.4, but since the change lists get large starting at version 2.1, it's worth stopping at v2.1.0 to make sure all our ope stuff still works.) We want to reach at least v2.5 to gain features like nbgitpuller/acceleratorprofile, so let's reassess after these first two version bumps.
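
As a rough way to sanity-check a staged path like the one above, here is a minimal Python sketch. It only compares version strings: it checks that each hop moves to a newer version and whether the last stop reaches a minimum target such as v2.5. The helper names (`parse_version`, `check_path`, `reaches`) are made up for illustration and are not part of any RHOAI or OLM tooling, and it does not query the cluster or the operator catalog.

```python
# Hypothetical helpers for illustration only; not part of RHOAI or OLM.

def parse_version(tag):
    """Turn 'v2.4.0' or '2.5' into a comparable tuple such as (2, 4, 0)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def check_path(path):
    """Raise ValueError unless every hop moves to a strictly newer version."""
    for current, nxt in zip(path, path[1:]):
        if parse_version(nxt) <= parse_version(current):
            raise ValueError(f"{current} -> {nxt} is not an upgrade")


def reaches(path, minimum):
    """Return True if the last stop on the path is at or above the minimum."""
    return parse_version(path[-1]) >= parse_version(minimum)


planned = ["v1.33.0", "v2.1.0", "v2.4.0"]
check_path(planned)               # passes: each hop is a newer version
print(reaches(planned, "v2.5"))   # False: the current plan stops short of v2.5
```

The `False` result matches the note above: the planned stops end at v2.4.0, so another bump is still needed to reach v2.5.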

Phase2 [May 3rd - May 10th]: install and config RHOAI cluster

Phase3 [May 10th - TBD]: #562 test the functionality of the new cluster and explore new features

hpdempsey commented 2 months ago

I asked Gagan to help coordinate this with the OpenShift AI group, because he was already working with them on running pre-GA release versions of RHOAI anyway. I can't assign him to the ticket (yet) because he isn't a member of this repo.

hpdempsey commented 2 months ago

I need a rough cut at this NLT 4/29 because we have a program call with the RHOAI team on 4/30.

DanNiESh commented 2 months ago

@hpdempsey I've updated a rough roadmap; please let me know if I'm missing anything. @dystewart Could you add more details to the Phase 2 cluster install?

hpdempsey commented 2 months ago

You mean you updated the roadmap in this issue description @DanNiESh ?

hpdempsey commented 2 months ago

I found out that we will also need a new version of RHEL for this (discuss details offline), so that's another step of integration and test.

DanNiESh commented 2 months ago

> You mean you updated the roadmap in this issue description @DanNiESh ?

Yes. I updated it in the issue description.

dystewart commented 2 months ago

@hpdempsey @DanNiESh I've made the beginning of an update plan in Phase 1 above and left a small addition to Phase 2 re actual upgrade install info

joachimweyl commented 1 month ago

@gagansk please accept the GitHub invite so I can add you to this issue.

joachimweyl commented 2 weeks ago

@dystewart what are the next steps for this issue?

DanNiESh commented 6 days ago

Closing this as draft 1. If we have new plans that need to be added to the roadmap, we will create a v2 draft and copy & paste everything from here.