canonical / grafana-agent-k8s-operator

https://charmhub.io/grafana-agent-k8s
Apache License 2.0

arm64 `ops.pebble.ChangeError` Start service "agent" (cannot start service: fork/exec /bin/agent: exec format error) #309

Open carlcsaposs-canonical opened 3 months ago

carlcsaposs-canonical commented 3 months ago

Bug Description

Deploying the charm on arm64 causes the agent-pebble-ready hook to fail with the following error:

unit-grafana-agent-k8s-0: 09:18:32 DEBUG unit.grafana-agent-k8s/0.juju-log ops 2.13.0 up and running.
unit-grafana-agent-k8s-0: 09:18:32 DEBUG unit.grafana-agent-k8s/0.juju-log Invalid Prometheus alert rules folder at /var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/prometheus_alert_rules: directory does not exist
unit-grafana-agent-k8s-0: 09:18:32 DEBUG unit.grafana-agent-k8s/0.juju-log no relation on 'tracing': tracing not ready
unit-grafana-agent-k8s-0: 09:18:32 DEBUG unit.grafana-agent-k8s/0.juju-log <class '__main__.GrafanaAgentK8sCharm'>.<property object at 0xf79c1987e700> returned None; quietly disabling charm_tracing for the run.
unit-grafana-agent-k8s-0: 09:18:32 DEBUG unit.grafana-agent-k8s/0.juju-log Emitting Juju event agent_pebble_ready.
unit-grafana-agent-k8s-0: 09:18:32 ERROR unit.grafana-agent-k8s/0.juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/./src/charm.py", line 261, in <module>
    main(GrafanaAgentK8sCharm)
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/main.py", line 544, in main
    manager.run()
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/main.py", line 520, in run
    self._emit()
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/main.py", line 509, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name)
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/main.py", line 143, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/framework.py", line 350, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/framework.py", line 849, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/framework.py", line 939, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 547, in wrapped_function
    return callable(*args, **kwargs)  # type: ignore
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/./src/charm.py", line 127, in _on_agent_pebble_ready
    self._container.autostart()
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/model.py", line 2149, in autostart
    self._pebble.autostart_services()
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/pebble.py", line 1899, in autostart_services
    return self._services_action('autostart', [], timeout, delay)
  File "/var/lib/juju/agents/unit-grafana-agent-k8s-0/charm/venv/ops/pebble.py", line 2001, in _services_action
    raise ChangeError(change.err, change)
ops.pebble.ChangeError: cannot perform the following tasks:
- Start service "agent" (cannot start service: fork/exec /bin/agent: exec format error)
----- Logs from task 0 -----
2024-07-03T09:18:32Z ERROR cannot start service: fork/exec /bin/agent: exec format error
-----
unit-grafana-agent-k8s-0: 09:18:32 ERROR juju.worker.uniter.operation hook "agent-pebble-ready" (via hook dispatching script: dispatch) failed: exit status 1
unit-grafana-agent-k8s-0: 09:18:32 ERROR juju.worker.uniter pebble poll failed for container "agent": failed to send pebble-ready event: hook failed
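
An "exec format error" on fork/exec usually means the kernel could not run the binary, most often because it was built for a different CPU architecture than the host. One way to check this, sketched below in Python, is to read the `e_machine` field of the ELF header and compare it with the host architecture. The helper names `elf_machine` and `binary_arch` are hypothetical and are not part of the charm or of ops. The sketch assumes the file at /bin/agent in the workload container can be read from somewhere Python is available, for example from a copy pulled out of the container.

```python
import struct

# e_machine values defined by the ELF specification (subset).
ELF_MACHINES = {3: "x86", 40: "arm", 62: "x86_64", 183: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the architecture named in an ELF header (first 20 bytes or more)."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown ({machine})")


def binary_arch(path: str) -> str:
    """Hypothetical convenience wrapper: read the header from a file on disk."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

If `binary_arch` reports `x86_64` for the agent binary while `platform.machine()` on the arm64 host reports `aarch64`, that would be consistent with an amd64 image having been deployed to the arm64 unit.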

To Reproduce

  1. Start arm64 Ubuntu 22.04 VM (our full cloud-init: https://github.com/canonical/self-hosted-runner-provisioner-azure/blob/bf2668c6eea97b001d998c9c6be2bbd6e3553a27/cloud_init.sh.jinja#L1-L36)
  2. sudo snap install juju
    sudo apt-get update
    sudo apt-get install retry -y
    sudo snap install microk8s --channel=1.30-strict/stable
    sudo adduser "$USER" 'snap_microk8s'
    newgrp snap_microk8s
    microk8s status --wait-ready
    retry --times 3 --delay 5 -- sudo microk8s enable dns
    microk8s status --wait-ready
    microk8s.kubectl rollout status --namespace kube-system --watch --timeout=5m deployments/coredns
    retry --times 3 --delay 5 -- sudo microk8s enable hostpath-storage
    microk8s.kubectl rollout status --namespace kube-system --watch --timeout=5m deployments/hostpath-provisioner
    mkdir ~/.kube/
    microk8s config > ~/.kube/config
  3. $ snap list
    Name      Version        Rev    Tracking            Publisher   Notes
    core20    20240416       2321   latest/stable       canonical✓  base
    juju      3.5.1          27226  3/stable            canonical✓  -
    lxd       5.0.3-d921d2e  28384  5.0/stable/…        canonical✓  -
    microk8s  v1.30.1        6855   1.30-strict/stable  canonical✓  -
    snapd     2.63           21761  latest/stable       canonical✓  snapd
  4. mkdir -p ~/.local/share/juju
    juju bootstrap microk8s
    juju model-defaults logging-config='<root>=INFO; unit=DEBUG'
    juju add-model test
    juju set-model-constraints arch=arm64
  5. $ juju deploy grafana-agent-k8s
    Deployed "grafana-agent-k8s" from charm-hub charm "grafana-agent-k8s", revision 75 in channel latest/stable on ubuntu@22.04/stable

Environment

arm64 Ubuntu 22.04 (see cloud init above) on microk8s

juju agent 3.5.1

See above for charm revision and snap revisions

Relevant log output

juju-debug-log.txt

Additional context

First encountered on https://github.com/canonical/mysql-router-k8s-operator/actions/runs/9757639606/job/26932906259?pr=278#step:23:4494 (ignore the KeyError in the logs; that is a separate issue, caused by us using an outdated lib)

IbraAoad commented 3 months ago

We should check if the revisions are using the arm rock for the agent.

lucabello commented 2 months ago

No new revisions have been released for ARM in a very long time because the ARM section of CI has been broken for 3 months :)