canonical / grafana-agent-k8s-operator

This charmed operator automates the operational procedures of running Grafana Agent, an open-source telemetry collector.
https://charmhub.io/grafana-agent-k8s
Apache License 2.0

Invalid cos-agent relation data raises instead of blocking #172

Closed sed-i closed 1 year ago

sed-i commented 1 year ago

Bug Description

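The new version of the cos-agent lib switched to a new schema with pydantic validation. As a result, relation data that does not match the schema makes the charm code raise an uncaught ValidationError, and the unit goes into error state. Instead, the charm should probably go into blocked status.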

To Reproduce

Relate grafana-agent (gagent) to zookeeper (revision 96).

Environment

Model                    Controller  Cloud/Region         Version  SLA          Timestamp
test-machine-agent-2za1  lxd         localhost/localhost  2.9.42   unsupported  10:29:04-04:00

App                  Version  Status  Scale  Charm          Channel        Rev  Exposed  Message
agent                         error       4  grafana-agent  edge             7  no       hook failed: "cos-agent-relation-joined"
principal-cos-agent           active      2  zookeeper      edge            96  no       
principal-juju-info  22.04    active      2  ubuntu         latest/stable   22  no       

Relevant log output

unit-agent-2: 10:23:10.930 DEBUG unit.agent/2.juju-log cos-agent:4: Emitting Juju event cos_agent_relation_joined.
unit-agent-2: 10:23:11.028 ERROR unit.agent/2.juju-log cos-agent:4: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-agent-2/charm/./src/charm.py", line 473, in <module>
    main(GrafanaAgentMachineCharm)
  File "/var/lib/juju/agents/unit-agent-2/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-agent-2/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-agent-2/charm/venv/ops/framework.py", line 354, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-agent-2/charm/venv/ops/framework.py", line 830, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-agent-2/charm/venv/ops/framework.py", line 919, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-agent-2/charm/lib/charms/grafana_agent/v0/cos_agent.py", line 477, in _on_relation_data_changed
    provider_data = CosAgentProviderUnitData(**json.loads(raw))
  File "/var/lib/juju/agents/unit-agent-2/charm/venv/pydantic/main.py", line 341, in __init__
    raise validation_error
pydantic.error_wrappers.ValidationError: 5 validation errors for CosAgentProviderUnitData
metrics_alert_rules
  field required (type=value_error.missing)
log_alert_rules
  field required (type=value_error.missing)
dashboards
  field required (type=value_error.missing)
metrics_scrape_jobs
  field required (type=value_error.missing)
log_slots
  field required (type=value_error.missing)
unit-agent-2: 10:23:11.251 ERROR juju.worker.uniter.operation hook "cos-agent-relation-joined" (via hook dispatching script: dispatch) failed: exit status 1

Additional context

No response

lucabello commented 1 year ago

We should catch the ValidationError in a try/except block and set the charm to blocked status, instead of letting the hook fail like this.
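
A minimal sketch of that idea, not the actual fix: it runs without pydantic or ops, so `ValidationError`, `parse_unit_data`, and `handle_relation_data` are hypothetical stand-ins, and the status is returned as a plain string. In the real charm, the `except` branch would presumably catch `pydantic.ValidationError` around `CosAgentProviderUnitData(**json.loads(raw))` and set `self.unit.status` to an `ops.model.BlockedStatus`.

```python
import json
from typing import Any, Dict, Optional, Tuple

# Field names taken from the validation errors in the traceback above.
REQUIRED_FIELDS = (
    "metrics_alert_rules",
    "log_alert_rules",
    "dashboards",
    "metrics_scrape_jobs",
    "log_slots",
)


class ValidationError(ValueError):
    """Stand-in for pydantic's ValidationError, so the sketch has no dependencies."""


def parse_unit_data(raw: str) -> Dict[str, Any]:
    """Hypothetical stand-in for CosAgentProviderUnitData(**json.loads(raw))."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValidationError(
            f"{len(missing)} validation errors: missing {', '.join(missing)}"
        )
    return data


def handle_relation_data(raw: str) -> Tuple[Optional[Dict[str, Any]], str]:
    """Return (data, status) instead of letting the validation error propagate."""
    try:
        return parse_unit_data(raw), "active"
    except ValidationError as e:
        # In a real charm, this is where the unit would be set to blocked.
        return None, f"blocked: invalid cos-agent relation data ({e})"
```

With the empty relation data from this report (`"{}"`), `handle_relation_data` returns no data and a status message starting with `blocked:`, rather than raising out of the hook.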