canonical / traefik-k8s-operator

https://charmhub.io/traefik-k8s
Apache License 2.0

error on `start` #393

Open PietroPasotti opened 2 months ago

PietroPasotti commented 2 months ago

Bug Description

Traefik errors out on `start`.

To Reproduce

juju deploy traefik-k8s on a bogged-down machine

Environment

Was deploying on my laptop, so this is probably a race of some sort that we don't usually hit.

Relevant log output

    return callable(*args, **kwargs)  # type: ignore
  File "./src/charm.py", line 609, in _process_status_and_configurations
    self._update_ingress_configurations()
  File "/var/lib/juju/agents/unit-traefik-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 544, in wrapped_function
    return callable(*args, **kwargs)  # type: ignore
  File "./src/charm.py", line 624, in _update_ingress_configurations
    self._clear_all_configs_and_restart_traefik()
  File "/var/lib/juju/agents/unit-traefik-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 544, in wrapped_function
    return callable(*args, **kwargs)  # type: ignore
  File "./src/charm.py", line 514, in _clear_all_configs_and_restart_traefik
    self.traefik.delete_dynamic_configs()
  File "/var/lib/juju/agents/unit-traefik-0/charm/src/traefik.py", line 640, in delete_dynamic_configs
    self._container.exec(["find", DYNAMIC_CONFIG_DIR, "-name", "*.yaml", "-delete"])
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/model.py", line 2865, in exec
    return self._pebble.exec(
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/pebble.py", line 2763, in exec
    raise ChangeError(change.err, change) from e
ops.pebble.ChangeError: cannot perform the following tasks:
- Execute command "find" (exec 22: timeout waiting for websocket connections: context deadline exceeded)
----- Logs from task 0 -----
2024-08-21T11:37:55Z ERROR exec 22: timeout waiting for websocket connections: context deadline exceeded

Additional context

This happened shortly after I opened a new SSH session into the VM. A bunch of containers were lost and a few other charms went into error too. The status resolved itself after a few minutes.

Abuelodelanada commented 1 month ago

Do we need a can_connect guard?

https://github.com/canonical/loki-k8s-operator/pull/442/files#diff-b9ed39bbc9c0387bd3e07da31d13373745534a1cd723d3e292c73496b12e307cR555
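For discussion, here is a rough sketch of what such a guard could look like around the `find ... -delete` call from the traceback. It is not the charm's actual code: the function name `delete_dynamic_configs_guarded`, the `transient_errors` parameter, and the `FakeContainer` / `FakeChangeError` stand-ins are all hypothetical, and the directory path is a placeholder. In the charm, the container would be an `ops.Container` and the error to catch would presumably be `ops.pebble.ChangeError`. Since the failure above was raised from inside `exec` itself, a `can_connect` check may not be enough on its own, so the sketch also catches the exec error and reports failure instead of raising.

```python
from typing import List, Tuple, Type


def delete_dynamic_configs_guarded(
    container,
    config_dir: str,
    transient_errors: Tuple[Type[Exception], ...],
) -> bool:
    """Delete *.yaml files under config_dir; return True on success.

    Returns False instead of raising when the container is unreachable or
    the exec fails with one of `transient_errors`, so that the caller can
    skip the update and retry on a later event.
    """
    # Guard: do not attempt the exec while Pebble is not reachable.
    if not container.can_connect():
        return False
    try:
        # Same command as in the traceback.
        container.exec(["find", config_dir, "-name", "*.yaml", "-delete"])
    except transient_errors:
        # can_connect() passed, but the exec still failed (for example
        # with a websocket timeout on an overloaded machine).
        return False
    return True


class FakeChangeError(Exception):
    """Hypothetical stand-in for ops.pebble.ChangeError."""


class FakeContainer:
    """Hypothetical stand-in for ops.Container, for illustration only."""

    def __init__(self, reachable: bool, exec_fails: bool) -> None:
        self.reachable = reachable
        self.exec_fails = exec_fails
        self.calls: List[List[str]] = []

    def can_connect(self) -> bool:
        return self.reachable

    def exec(self, command: List[str]) -> None:
        self.calls.append(command)
        if self.exec_fails:
            raise FakeChangeError("timeout waiting for websocket connections")


if __name__ == "__main__":
    flaky = FakeContainer(reachable=True, exec_fails=True)
    # Placeholder path; the real value of DYNAMIC_CONFIG_DIR is not shown here.
    done = delete_dynamic_configs_guarded(
        flaky, "/path/to/dynamic/configs", (FakeChangeError,)
    )
    print("configs cleared:", done)
```

Returning a boolean rather than raising leaves the decision to the caller, which could set a waiting status or defer the event and let a later hook retry, instead of putting the unit into error.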