canonical / grafana-agent-k8s-operator

https://charmhub.io/grafana-agent-k8s
Apache License 2.0

[Machine Charm] Nondeterministic behavior when there is another subordinate charm. #210

Closed simskij closed 1 year ago

simskij commented 1 year ago

Enhancement Proposal

Currently, the charm expects there to be exactly one principal relation. However, with co-located subordinates, this no longer holds. To allow for this, the charm needs to properly handle and merge configuration provided by both the actual principal and any co-deployed subordinates.

The deployment scenario illustrated as a diagram: 20230615_15h01m14s_grim
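
As a rough illustration of the merge described above, here is a minimal sketch, not the charm's actual implementation. It assumes each related app (the principal or a co-located subordinate) provides a list of scrape jobs as plain dicts; the function name `merge_scrape_jobs` and the `job_name`/`targets` keys are hypothetical. The point is only that the result should not depend on the order in which relation data arrives.

```python
from typing import Dict, List


def merge_scrape_jobs(jobs_by_app: Dict[str, List[dict]]) -> List[dict]:
    """Merge scrape jobs from several related apps into one deterministic list.

    `jobs_by_app` maps a (hypothetical) app name to the jobs it provided.
    """
    merged: Dict[str, dict] = {}
    # Visit apps in sorted order so the outcome is independent of the
    # order in which relations were established or events fired.
    for app in sorted(jobs_by_app):
        for job in jobs_by_app[app]:
            # Prefix each job with its app name so that two apps offering
            # a job with the same name do not overwrite each other.
            name = f"{app}_{job['job_name']}"
            merged[name] = {**job, "job_name": name}
    return [merged[name] for name in sorted(merged)]
```

With a principal and a subordinate that both offer a job called `node`, both jobs survive under distinct names, and swapping the input order yields the same list.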

dstathis commented 1 year ago

Changed the issue name to be a bit more clear.

sed-i commented 1 year ago

Probably related: https://github.com/canonical/operator/issues/945

dstathis commented 1 year ago

@simskij @sed-i

I think we are approaching this the wrong way. Correct me if I am wrong, but as I understand it, this issue is really "make grafana-agent work with hardware-health". To this end, I have two suggestions.

  1. Hardware Health is a charm that monitors the hardware of the machine. This sounds like something that grafana-agent would be excellent at. We should build the checks from hardware health into grafana-agent.
  2. If option 1 is not possible, we should consider just using a magic directory for communication. Inter-subordinate communication is extremely complicated and behavior will likely never be well defined in Juju documentation.

sed-i commented 1 year ago

@dstathis

  1. gagent is a general purpose charm and hw-health is very specific. Even if we converted hw-health to a snap or to a standalone exporter, we'd still need something other than the principal and gagent to deploy it, relate it, ... back to the original problem.
  2. Do you mean via a textfile collector? But even this way, wouldn't we need a relation between hw-health and gagent to tell it about the plugs/filename/..? And until chained subords are implemented, we'd also need the relation between hw-health and the principal.

dstathis commented 1 year ago

> @dstathis
>
>   1. gagent is a general purpose charm and hw-health is very specific. Even if we converted hw-health to a snap or to a standalone exporter, we'd still need something other than the principal and gagent to deploy it, relate it, ... back to the original problem.
>   2. Do you mean via a textfile collector? But even this way, wouldn't we need a relation between hw-health and gagent to tell it about the plugs/filename/..? And until chained subords are implemented, we'd also need the relation between hw-health and the principal.

  1. If hw-health were a snap, it could easily be deployed by grafana-agent itself when enabled. This would allow for tight, reliable integration. I'm not familiar with everything hw-health is supposed to do. Perhaps it would be appropriate as an optional feature of grafana-agent, or perhaps not.

  2. You would not need a relation as far as I can tell. Just have hw-health always write to the same file, at a path unique enough to avoid collisions, like /var/<model-uuid>/hw-health-charm/metrics. Additionally, we could communicate whatever we want with files in that directory. If the relation always happens between these exact two charms, and always on the same machine, there's really no reason why we need to go through Juju for communication.

  3. While 2 might seem "not juju-y", I think the ambiguity of relating the two charms will make for a very unreliable solution. Even if we figure out in detail how multiple subordinates relate to each other, it will be based on testing rather than any specification or documentation, which would mean the behavior could feasibly change with any Juju release.
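
The fixed-file idea from point 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names (`metrics_path`, `write_metrics`, `read_metrics`), the metric names, and the simple `name value` line format are all hypothetical, and the base directory is a parameter so the example does not need to write under /var. The writer replaces the file atomically, and the reader tolerates a missing file, since without a relation nothing signals when the other charm is ready.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict


def metrics_path(base: Path, model_uuid: str) -> Path:
    """Build the agreed path, mirroring <base>/<model-uuid>/hw-health-charm/metrics."""
    return base / model_uuid / "hw-health-charm" / "metrics"


def write_metrics(path: Path, metrics: Dict[str, float]) -> None:
    """Writer side (hw-health in this sketch): replace the file atomically."""
    path.parent.mkdir(parents=True, exist_ok=True)
    body = "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))
    # Write to a temporary file in the same directory, then rename it over
    # the target, so a reader never observes a half-written file.
    fd, tmp = tempfile.mkstemp(dir=path.parent)
    with os.fdopen(fd, "w") as f:
        f.write(body)
    os.chmod(tmp, 0o644)  # mkstemp creates the file as 0600; let other users read it
    os.replace(tmp, path)


def read_metrics(path: Path) -> Dict[str, float]:
    """Reader side (grafana-agent in this sketch): a missing file means no data yet."""
    try:
        text = path.read_text()
    except FileNotFoundError:
        return {}
    result: Dict[str, float] = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, value = line.rsplit(" ", 1)
        result[name] = float(value)
    return result
```

Both sides only have to agree on the path and the line format, which is the trade-off under discussion: nothing goes through Juju, but nothing is validated by Juju either.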