canonical / github-runner-operator

github-runner-operator - charm repository.
Apache License 2.0

Charm with too long name, not able to create runner VM #272

Open rgildein opened 2 months ago

rgildein commented 2 months ago

Bug Description

The charm ends up in an error state and keeps trying to re-create the runner VM forever. The logs did not provide much information about why (see the log output). After trying different configurations, I went to the machine and used LXD directly, where I found that the issue is a UNIX socket path that is too long. Output from lxc on the charm unit:

lxc info --show-log runner-charm-os-service-checks-1-400033f2168e6db04cc9b6c9
Name: runner-charm-os-service-checks-1-400033f2168e6db04cc9b6c9
Status: STOPPED
Type: virtual-machine (ephemeral)
Architecture: x86_64
Created: 2024/04/29 12:07 UTC

Log:

qemu-system-x86_64:/var/snap/lxd/common/lxd/logs/runner-charm-os-service-checks-1-400033f2168e6db04cc9b6c9/qemu.conf:230: UNIX socket path '/var/snap/lxd/common/lxd/logs/runner-charm-os-service-checks-1-400033f2168e6db04cc9b6c9/virtio-fs.config.sock' is too long
Path must be less than 108 bytes
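The length arithmetic behind this error can be checked with a short Python sketch. The log directory and socket file name are copied from the qemu message above; the constant and function names are hypothetical, chosen for illustration, and are not part of the charm or of LXD. The sketch covers only this one socket path, so other sockets or LXD versions may impose a different limit.

```python
# Values copied from the qemu error above; other LXD versions or other
# sockets may use different paths (assumption: this socket is the limiting one).
LXD_LOG_DIR = "/var/snap/lxd/common/lxd/logs"
SOCKET_FILE = "virtio-fs.config.sock"
# qemu reports "Path must be less than 108 bytes", so 107 bytes is the maximum.
MAX_SOCKET_PATH_BYTES = 107


def socket_path(instance_name: str) -> str:
    """Build the socket path in the layout shown in the log."""
    return f"{LXD_LOG_DIR}/{instance_name}/{SOCKET_FILE}"


def socket_path_fits(instance_name: str) -> bool:
    """Return True if the socket path stays within the byte limit."""
    return len(socket_path(instance_name).encode()) <= MAX_SOCKET_PATH_BYTES


def max_instance_name_bytes() -> int:
    """Longest instance name (in bytes) that still fits for this socket."""
    return MAX_SOCKET_PATH_BYTES - len(socket_path("").encode())


if __name__ == "__main__":
    name = "runner-charm-os-service-checks-1-400033f2168e6db04cc9b6c9"
    print(len(socket_path(name).encode()))  # 109
    print(socket_path_fits(name))           # False
    print(max_instance_name_bytes())        # 55
```

For the instance in this report the path is 109 bytes, two bytes over the maximum. The instance name is 57 bytes, while at most 55 would fit for this socket.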

To Reproduce

juju deploy github-runner --constraints="cores=4 mem=8G" --config token=*** --config path=canonical/charm-openstack-service-checks --config virtual-machines=1 --config vm-memory=7GiB --config vm-disk=20GiB --config vm-cpu=4 --channel latest/edge runner-charm-os-service-checks

Environment

latest/edge rev.177

Relevant log output

unit-runner-charm-os-service-checks-1: 12:03:04 ERROR unit.runner-charm-os-service-checks/1.juju-log Retry limit of 5 exceed: Unable to start the LXD instance runner-charm-os-service-checks-1-9c22135094d5b08a1d3fe0e7
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/src/lxd.py", line 206, in start
    self._pylxd_instance.start(timeout, force, wait)
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/venv/pylxd/models/instance.py", line 380, in start
    return self._set_state("start", timeout=timeout, force=force, wait=wait)
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/venv/pylxd/models/instance.py", line 365, in _set_state
    self.client.operations.wait_for_operation(response.json()["operation"])
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/venv/pylxd/models/operation.py", line 57, in wait_for_operation
    operation.wait()
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/venv/pylxd/models/operation.py", line 94, in wait
    response = self._client.api.operations[self.id].wait.get()
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/venv/pylxd/client.py", line 207, in get
    self._assert_response(
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/venv/pylxd/client.py", line 178, in _assert_response
    raise exceptions.LXDAPIException(response)
pylxd.exceptions.LXDAPIException: Failed to run: forklimits limit=memlock:unlimited:unlimited fd=3 fd=4 -- /snap/lxd/28322/bin/qemu-system-x86_64 -S -name runner-charm-os-service-checks-1-9c22135094d5b08a1d3fe0e7 -uuid 4ad84de1-9e5f-4d05-bbf8-af1cebecd5fc -daemonize -cpu host,hv_passthrough -nographic -serial chardev:console -nodefaults -no-user-config -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=allow,resourcecontrol=deny -readconfig /var/snap/lxd/common/lxd/logs/runner-charm-os-service-checks-1-9c22135094d5b08a1d3fe0e7/qemu.conf -spice unix=on,disable-ticketing=on,addr=/var/snap/lxd/common/lxd/logs/runner-charm-os-service-checks-1-9c22135094d5b08a1d3fe0e7/qemu.spice -pidfile /var/snap/lxd/common/lxd/logs/runner-charm-os-service-checks-1-9c22135094d5b08a1d3fe0e7/qemu.pid -D /var/snap/lxd/common/lxd/logs/runner-charm-os-service-checks-1-9c22135094d5b08a1d3fe0e7/qemu.log -smbios type=2,manufacturer=Canonical Ltd.,product=LXD -runas lxd: : exit status 1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/src/utilities.py", line 79, in fn_with_retry
    return func(*args, **kwargs)
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/src/runner.py", line 500, in _start_instance
    self.instance.start(wait=True)
  File "/var/lib/juju/agents/unit-runner-charm-os-service-checks-1/charm/src/lxd.py", line 209, in start
    raise LxdError(f"Unable to start the LXD instance {self.name}") from err
errors.LxdError: Unable to start the LXD instance runner-charm-os-service-checks-1-9c22135094d5b08a1d3fe0e7
unit-runner-charm-os-service-checks-1: 12:03:04 ERROR unit.runner-charm-os-service-checks/1.juju-log Unable to create runner: runner-charm-os-service-checks-1-9c22135094d5b08a1d3fe0e7

Additional context

No response

cbartz commented 2 months ago

This seems to be related to https://github.com/canonical/lxd/issues/12539