operate-first / apps

Re-install Observatorium via ACM #365

Closed: tumido closed this issue 3 years ago

tumido commented 3 years ago

When we first tried installing Observatorium via ACM, we didn't read the docs properly and tried to deploy an Observatorium resource directly. The docs suggest that ACM provides a wrapper on top of Observatorium that is required for this to work properly:

https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.2/html/observing_environments/observing-environments-intro#enable-observability

tumido commented 3 years ago

We want to deploy Observatorium on a managed cluster (not a management one, so no MultiClusterHub is deployed). We may need to follow the docs and make some changes to our Observatorium resource.

Initially we tried subscribing to the ACM operator on that cluster and deploying Observatorium directly. Naturally it didn't work, probably because ACM requires a MultiClusterObservability resource to be deployed instead (so that the Observatorium operator gets deployed, etc.).

durandom commented 3 years ago

cc @randymgeorge

@4n4nd can elaborate on the requirements we expect, and then we can dig into how many of them ACM meets.

But in short, we want to be able to store arbitrary metrics long-term: not just those from OCP, but also from any workload. Basically Thanos as a service. Same for logs.

tumido commented 3 years ago

➕ We want to deploy that on a managed cluster (MOC Zero), not on the management cluster (MOC Infra).

HumairAK commented 3 years ago

Would this mean we need to deploy the MCO CR to have the Observatorium operator available? Which, from what I gather from the docs above, would require us to configure an object store for it?

This may sound like a silly question, but at that point, can we not just deploy ACM on the zero cluster, deploy MCO, and use that Thanos?
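For context, here is a rough sketch of what "deploy MCO plus an object store" might look like. It assumes the field names from the ACM 2.2 observability docs linked above (`storageConfigObject`, `metricObjectStorage`, a `thanos.yaml` key, the `open-cluster-management-observability` namespace); the bucket, endpoint, and credentials are placeholders, and the helper function names are made up for illustration. Please check it against the docs before relying on it.

```python
import json

# Namespace assumed from the ACM 2.2 observability docs.
NAMESPACE = "open-cluster-management-observability"

# Thanos object store config. Every value here is a hypothetical placeholder.
THANOS_YAML = """\
type: s3
config:
  bucket: EXAMPLE_BUCKET
  endpoint: s3.example.com
  insecure: true
  access_key: EXAMPLE_ACCESS_KEY
  secret_key: EXAMPLE_SECRET_KEY
"""


def object_store_secret(name="thanos-object-storage", key="thanos.yaml"):
    """Build the Secret that holds the Thanos object store config."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": NAMESPACE},
        "type": "Opaque",
        "stringData": {key: THANOS_YAML},
    }


def multicluster_observability(secret_name="thanos-object-storage",
                               key="thanos.yaml"):
    """Build a minimal MultiClusterObservability CR pointing at the Secret."""
    return {
        "apiVersion": "observability.open-cluster-management.io/v1beta1",
        "kind": "MultiClusterObservability",
        "metadata": {"name": "observability"},
        "spec": {
            "observabilityAddonSpec": {},
            "storageConfigObject": {
                "metricObjectStorage": {"name": secret_name, "key": key},
            },
        },
    }


def manifest_list():
    """Wrap both manifests in a v1 List so they form one JSON document."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [object_store_secret(), multicluster_observability()],
    }


if __name__ == "__main__":
    print(json.dumps(manifest_list(), indent=2))
```

The printed JSON only illustrates the shape of the two resources. Whether the MCO CR can be reconciled on a cluster without a MultiClusterHub is a separate question that this sketch does not answer.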

randymgeorge commented 3 years ago

I agree that we should elaborate on the requirements. The discussion of what gets deployed where is confusing.

--

Regards,

Randy George

Distinguished Engineer, Advanced Cluster Management

Red Hat https://www.redhat.com/

HumairAK commented 3 years ago

Okay, so to clarify, there are 2 clusters:

moc-infra is our management cluster; this is where we deploy ACM/ArgoCD. The ACM instance in this cluster manages/deploys the other clusters (of which there is currently only the moc-zero cluster).

moc-zero is where we deploy our apps/services like ODH. This is also where we currently have the Observatorium operator deployed. We deploy an Observatorium CR that provisions Thanos (see here). We were wondering if we can use ACM for this instead.

HumairAK commented 3 years ago

The way I understand it we can do one of 2 things:

Does that sound right?

tumido commented 3 years ago

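Also, MCO requires object storage to be set up on the cluster. If we want to deploy this to moc-infra, we would need to either:

  • Deploy the local storage operator + OCS, but we probably don't have that amount of storage available there.
  • Or connect it to an outside storage cluster (IMHO the better option)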

randymgeorge commented 3 years ago

Per Marcel, what are the requirements? I am not sure why you are deploying Thanos on the managed cluster. Is this part of the application workload?

On Fri, Mar 12, 2021 at 10:54 AM Tomáš Coufal @.***> wrote:

Also, MCO requires object storage to be set up on the cluster. If we want to deploy this to moc-infra, we would need to either:

  • Deploy the local storage operator + OCS, but we probably don't have that amount of storage available there.
  • Or connect it to an outside storage cluster (IMHO the better option)

randymgeorge commented 3 years ago

It sounds like this is just part of the Open Data Hub workload and not related to operations. In that case, I would deploy it like any other application content. That can be via ACM or via ArgoCD. It was my misunderstanding: I thought you were deploying Observatorium for management, for which ACM already deploys an instance.

tumido commented 3 years ago

ACM marks the MultiClusterHub as a required resource. Therefore I'm a bit worried that we can't set up "only" Observatorium on a managed cluster without setting it up as a management cluster as well.

randymgeorge commented 3 years ago

See my comment on GitHub. This is a use case where Observatorium is part of the workload running on the cluster. ACM would not be the right approach.

4n4nd commented 3 years ago

Current Monitoring Structure

4n4nd commented 3 years ago

The point of the discussion was to see if we could replace the Observatorium operator with ACM and use the ACM operator to deploy the Observatorium CR for MOC Zero cluster monitoring.

tumido commented 3 years ago

Our target is to provide Observatorium as a service for users on the zero cluster only. Currently we have no intention of investing in multi-cluster observability managed from the moc-infra management cluster, since setting up things like storage on this (bare metal) cluster would add a lot of overhead. The intention behind this issue was to look for an easier way to deploy Observatorium. ACM appeared to be that easier way; however, it turned out not to be a valid way forward, since it requires making zero a management cluster as well. This is a technical obstacle.

If we want to move this forward we have 2 options:

Therefore I'm closing this issue for now, since either path requires a discussion of greater scope. ACM is not offering us an "easier way" this time. Feel free to reopen if the status changes.