opendatahub-io / opendatahub-operator

Open Data Hub operator to manage ODH component integrations
https://opendatahub.io
Apache License 2.0
60 stars 139 forks source link

feat: add managed model registry prometheus config handling logic, fixes RHOAIENG-4273 #1150

Closed dhirajsb closed 3 months ago

dhirajsb commented 3 months ago

Description

Adds managed model registry prometheus config handling logic Fixes RHOAIENG-4273

https://issues.redhat.com/browse/RHOAIENG-4273

How Has This Been Tested?

Tested that the code changes do not affect ODH unit tests. Further testing of the actual prometheus config will continue for other PR https://github.com/red-hat-data-services/rhods-operator/pull/318 But no component code changes will be needed for any changes in the prometheus config.

Screenshot or short clip

Merge criteria

dhirajsb commented 3 months ago

WIP, working on changes to model registry operator metrics service configuration. I need to locally test model registry to verify whether extra manifests are neeeded in prometheus/metrics/apps.

dhirajsb commented 3 months ago

@ykaliuta is there an easy way to combine the config logic from this PR and prometheus config in the PR https://github.com/red-hat-data-services/rhods-operator/pull/318 to test them together? Or, should I manually setup the prometheus config in a local cluster to verify?

ykaliuta commented 3 months ago

@ykaliuta is there an easy way to combine the config logic from this PR and prometheus config in the PR red-hat-data-services#318 to test them together? Or, should I manually setup the prometheus config in a local cluster to verify?

Do you mean to have a branch with all the changes? Rebase 1150 on top of 318, it depends of it. Basically, they should come as one patchset in one PR if we did not have that squashing policy. But even squashing them in one patch does not look a problem for me in this case.

Or I'm missing something?

zdtsw commented 3 months ago

/LGTM

I am fine with this change. @dhirajsb I said in the offline chat. as long as the logic wont break ODH build,and regression, it can get into the code base.

But you two have started discussion of how it should be tested^ so I leave it to you to finalize that before the final call.

In general, I would say, this is a tricky topic: as we have not decided how should this specific monitoring be handled in ODH. Plus, it gets a bit challenge to fully test it by Devel. One thing can be done as @ykaliuta mentioned, cherry-pick this change onto downstream's PR and make build from there to test. or vice versa.

ykaliuta commented 3 months ago

Or I'm missing something?

Yes, different branches :) But anyway, as @zdtsw said.

dhirajsb commented 3 months ago

@zdtsw let's merge this boilerplate code for odh 2.16 since it's exactly the same as all other rhoai components.

I'll continue working on testing the actual prometheus yaml config on the other RHDS PR https://github.com/red-hat-data-services/rhods-operator/pull/318

dhirajsb commented 3 months ago

/retest

openshift-ci[bot] commented 3 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zdtsw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/opendatahub-io/opendatahub-operator/blob/incubation/OWNERS)~~ [zdtsw] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
zdtsw commented 3 months ago

@VaishnaviHire + @dhirajsb FYI, i wont sync this PR into downstream 2.12 for now. we can let it be in ODH 2.16 first and get the downstream config changes later in nightly along with this PR.