eclipse-ankaios / ankaios

Eclipse Ankaios provides workload and container orchestration for automotive High Performance Computing (HPC) software.
https://eclipse-ankaios.github.io/ankaios/
Apache License 2.0

Fix traces when creating/replacing/deleting workloads #283

Closed krucod3 closed 4 months ago

krucod3 commented 4 months ago

Currently, the traces from the agent when creating/replacing/resuming workloads are repetitive and confusing. Here is one example:

[2024-06-06T13:45:05Z INFO  ank_agent::agent_manager] Awaiting commands from the server ...
[2024-06-06T13:45:05Z INFO  ank_agent::runtime_manager] Received a new desired state with '2' added and '0' deleted workloads.
[2024-06-06T13:45:05Z INFO  ank_agent::runtime_manager] Found '0' existing 'podman-kube' workload(s).
[2024-06-06T13:45:06Z INFO  ank_agent::runtime_manager] Found '2' existing 'podman' workload(s).
[2024-06-06T13:45:06Z INFO  ank_agent::runtime_manager] Deleting existing workload 'nginx_from_manifest2'. It is created when its dependencies are fulfilled.
[2024-06-06T13:45:06Z INFO  ank_agent::runtime_connectors::runtime_facade] Deleting 'podman' workload 'nginx_from_manifest2' on agent 'agent_A'
[2024-06-06T13:45:06Z INFO  ank_agent::runtime_manager] Resuming workload 'nginx_from_manifest1'
[2024-06-06T13:45:06Z INFO  ank_agent::runtime_connectors::runtime_facade] Resuming 'podman' workload 'nginx_from_manifest1'.
[2024-06-06T13:45:06Z INFO  ank_agent::runtime_connectors::runtime_facade] Creating 'podman' workload 'nginx_from_manifest2'.
[2024-06-06T13:45:06Z INFO  ank_agent::runtime_connectors::podman::podman_runtime] Podman has returned error 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by 16946c51a9df4bf578a6d96e85f50e4aa1c10d0d60b962a7107d1e49c1c3b50d. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.
    ', deleting broken container.
[2024-06-06T13:45:06Z INFO  ank_agent::workload::workload_control_loop] Failed to create workload: 'nginx_from_manifest2': 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by 16946c51a9df4bf578a6d96e85f50e4aa1c10d0d60b962a7107d1e49c1c3b50d. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.
    '

According to the traces, the workload that failed during creation appears never to be created at all. The traces also do not give the reason why the workload is created again.

Looking at the debug traces answers some questions:

[2024-06-06T13:46:28Z DEBUG ank_agent] Starting the Ankaios agent with 
        name: 'agent_A', 
        server url: 'http://127.0.0.1:25551/', 
        run directory: '/tmp/ankaios/'
[2024-06-06T13:46:28Z DEBUG grpc::client] gRPC Communication Client starts.
[2024-06-06T13:46:28Z INFO  ank_agent::agent_manager] Awaiting commands from the server ...
[2024-06-06T13:46:28Z DEBUG ank_agent::agent_manager] Process command received from server.
[2024-06-06T13:46:28Z DEBUG ank_agent::agent_manager] Agent 'agent_A' received UpdateWorkload:
        Added workloads: [WorkloadSpec { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest1", id: "7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d" }, tags: [], dependencies: {}, restart_policy: Never, runtime: "podman", runtime_config: "image: docker.io/nginx:latest\ncommandOptions: [\"-p\", \"8081:80\"]\n" }, WorkloadSpec { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, tags: [], dependencies: {}, restart_policy: Never, runtime: "podman", runtime_config: "image: docker.io/nginx:latest\ncommandOptions: [\"-p\", \"8082:80\"]\n" }]
        Deleted workloads: []
[2024-06-06T13:46:28Z INFO  ank_agent::runtime_manager] Received a new desired state with '2' added and '0' deleted workloads.
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_manager] Handling initial workload list.
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::runtime_facade] Searching for reusable 'podman-kube' workloads on agent 'agent_A'.
[2024-06-06T13:46:28Z INFO  ank_agent::runtime_manager] Found '0' existing 'podman-kube' workload(s).
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::runtime_facade] Searching for reusable 'podman' workloads on agent 'agent_A'.
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found 2 reusable workload(s): '["nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A", "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A"]'
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A'
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A': '0ed6c5e4d47a810a40b5ae6cba741db515297905a2a4839b9babcbfeb606b1e1'
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A'
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A': 'a97fed9cede772389f1d57b30fc9d3e72f5f26b2010f665bf4d514b6c3a6f6fe'
[2024-06-06T13:46:28Z INFO  ank_agent::runtime_manager] Found '2' existing 'podman' workload(s).
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_manager] Creating control interface pipes for 'WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest1", id: "7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d" }'
[2024-06-06T13:46:28Z INFO  ank_agent::runtime_manager] Resuming workload 'nginx_from_manifest1'
[2024-06-06T13:46:28Z INFO  ank_agent::runtime_connectors::runtime_facade] Resuming 'podman' workload 'nginx_from_manifest1'.
[2024-06-06T13:46:28Z INFO  ank_agent::runtime_manager] Deleting existing workload 'nginx_from_manifest2'. It is created when its dependencies are fulfilled.
[2024-06-06T13:46:28Z INFO  ank_agent::runtime_connectors::runtime_facade] Deleting 'podman' workload 'nginx_from_manifest2' on agent 'agent_A'
[2024-06-06T13:46:28Z INFO  ank_agent::runtime_connectors::runtime_facade] Creating 'podman' workload 'nginx_from_manifest2'.
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A'
[2024-06-06T13:46:28Z DEBUG ank_agent::workload::workload_control_loop] Received WorkloadCommand::Resume.
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A'
[2024-06-06T13:46:28Z DEBUG ank_agent::workload::workload_control_loop] Received WorkloadCommand::Create.
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman_cli] Creating the workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' with image 'docker.io/nginx:latest'
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman_cli] The args are: '["run", "--detach", "--name", "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "-p", "8082:80", "--mount=type=bind,source=/tmp/ankaios/agent_A_io/nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3,destination=/run/ankaios/control_interface", "--label=name=nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "--label=agent=agent_A", "docker.io/nginx:latest"]'
[2024-06-06T13:46:28Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Pending(Starting), additional_info: "Triggered at runtime." } }'.
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A': '0ed6c5e4d47a810a40b5ae6cba741db515297905a2a4839b9babcbfeb606b1e1'
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Starting the checker for the workload 'nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A' with internal id '0ed6c5e4d47a810a40b5ae6cba741db515297905a2a4839b9babcbfeb606b1e1'
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A': 'a97fed9cede772389f1d57b30fc9d3e72f5f26b2010f665bf4d514b6c3a6f6fe'
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Deleting workload with id 'a97fed9cede772389f1d57b30fc9d3e72f5f26b2010f665bf4d514b6c3a6f6fe'
[2024-06-06T13:46:28Z DEBUG ank_agent::generic_polling_state_checker] The workload nginx_from_manifest1 has changed its state to ExecutionState { state: Running(Ok), additional_info: "" }
[2024-06-06T13:46:28Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest1", id: "7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d" }, execution_state: ExecutionState { state: Running(Ok), additional_info: "" } }'.
[2024-06-06T13:46:28Z INFO  ank_agent::runtime_connectors::podman::podman_runtime] Podman has returned error 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by a97fed9cede772389f1d57b30fc9d3e72f5f26b2010f665bf4d514b6c3a6f6fe. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.
    ', deleting broken container.
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] The broken container has been deleted successfully
[2024-06-06T13:46:28Z INFO  ank_agent::workload::workload_control_loop] Failed to create workload: 'nginx_from_manifest2': 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by a97fed9cede772389f1d57b30fc9d3e72f5f26b2010f665bf4d514b6c3a6f6fe. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.
    '
[2024-06-06T13:46:28Z DEBUG ank_agent::workload::workload_control_loop] Received WorkloadCommand::Retry.
[2024-06-06T13:46:28Z DEBUG ank_agent::workload::workload_control_loop] Next retry attempt.
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman_cli] Creating the workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' with image 'docker.io/nginx:latest'
[2024-06-06T13:46:28Z DEBUG ank_agent::runtime_connectors::podman_cli] The args are: '["run", "--detach", "--name", "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "-p", "8082:80", "--mount=type=bind,source=/tmp/ankaios/agent_A_io/nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3,destination=/run/ankaios/control_interface", "--label=name=nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "--label=agent=agent_A", "docker.io/nginx:latest"]'
[2024-06-06T13:46:28Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Pending(StartingFailed), additional_info: "Execution of command failed: Error: creating container storage: the container name \"nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A\" is already in use by a97fed9cede772389f1d57b30fc9d3e72f5f26b2010f665bf4d514b6c3a6f6fe. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.\n" } }'.
[2024-06-06T13:46:28Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Pending(Starting), additional_info: "Triggered at runtime." } }'.
[2024-06-06T13:46:29Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] The workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' has been created with internal id '3f448c0da0b57d3b7ee6659a13a1bf3a2755f26d9f2393f9d07073d5f0652f17'
[2024-06-06T13:46:29Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Starting the checker for the workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' with internal id '3f448c0da0b57d3b7ee6659a13a1bf3a2755f26d9f2393f9d07073d5f0652f17'
[2024-06-06T13:46:29Z DEBUG ank_agent::workload::workload_control_loop] Created workload 'nginx_from_manifest2' successfully.
[2024-06-06T13:46:29Z DEBUG ank_agent::generic_polling_state_checker] The workload nginx_from_manifest2 has changed its state to ExecutionState { state: Running(Ok), additional_info: "" }
[2024-06-06T13:46:29Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Running(Ok), additional_info: "" } }'.

Current Behavior

Traces are confusing

Expected Behavior

Traces explain what is done by the agent

Steps to Reproduce

  1. Start Ankaios
  2. Create two running workloads
  3. Stop one container via Podman
  4. Restart the agent

Context (Environment)

Ankaios built from main

Logs

Additional Information

Final result

To be filled by the one closing the issue.

krucod3 commented 4 months ago

After updating the traces:

[2024-06-07T06:42:14Z INFO  ank_agent::agent_manager] Awaiting commands from the server ...
[2024-06-07T06:42:14Z INFO  ank_agent::runtime_manager] Received a new desired state with '2' added and '0' deleted workloads.
[2024-06-07T06:42:14Z INFO  ank_agent::runtime_manager] Found '2' existing 'podman' workload(s).
[2024-06-07T06:42:14Z INFO  ank_agent::runtime_manager] Resuming workload 'nginx_from_manifest1'
[2024-06-07T06:42:14Z INFO  ank_agent::runtime_manager] Replacing existing workload 'nginx_from_manifest2'.
[2024-06-07T06:42:15Z INFO  ank_agent::runtime_manager] Found '0' existing 'podman-kube' workload(s).
[2024-06-07T06:42:15Z INFO  ank_agent::workload::workload_control_loop] Retrying workload creation for: 'nginx_from_manifest2'. Error: 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by 5ac804d76070b698e7d7d229b1b9c9af178043af276cfad4eefbc5772a0a93e1. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.'
[2024-06-07T06:42:15Z INFO  ank_agent::workload::workload_control_loop] Successfully created workload 'nginx_from_manifest2'.

Here also with debug on:

[2024-06-07T06:44:29Z DEBUG ank_agent] Starting the Ankaios agent with 
        name: 'agent_A', 
        server url: 'http://127.0.0.1:25551/', 
        run directory: '/tmp/ankaios/'
[2024-06-07T06:44:29Z INFO  ank_agent::agent_manager] Awaiting commands from the server ...
[2024-06-07T06:44:29Z DEBUG grpc::client] gRPC Communication Client starts.
[2024-06-07T06:44:29Z DEBUG ank_agent::agent_manager] Process command received from server.
[2024-06-07T06:44:29Z DEBUG ank_agent::agent_manager] Agent 'agent_A' received UpdateWorkload:
        Added workloads: [WorkloadSpec { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest1", id: "7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d" }, tags: [], dependencies: {}, restart_policy: Never, runtime: "podman", runtime_config: "image: docker.io/nginx:latest\ncommandOptions: [\"-p\", \"8081:80\"]\n" }, WorkloadSpec { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, tags: [], dependencies: {}, restart_policy: Never, runtime: "podman", runtime_config: "image: docker.io/nginx:latest\ncommandOptions: [\"-p\", \"8082:80\"]\n" }]
        Deleted workloads: []
[2024-06-07T06:44:29Z INFO  ank_agent::runtime_manager] Received a new desired state with '2' added and '0' deleted workloads.
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_manager] Handling initial workload list.
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::runtime_facade] Searching for reusable 'podman' workloads on agent 'agent_A'.
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found 2 reusable workload(s): '["nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A", "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A"]'
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A'
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A': '0ed6c5e4d47a810a40b5ae6cba741db515297905a2a4839b9babcbfeb606b1e1'
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A': '39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96'
[2024-06-07T06:44:30Z INFO  ank_agent::runtime_manager] Found '2' existing 'podman' workload(s).
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_manager] Creating control interface pipes for 'WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest1", id: "7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d" }'
[2024-06-07T06:44:30Z INFO  ank_agent::runtime_manager] Resuming workload 'nginx_from_manifest1'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::runtime_facade] Resuming 'podman' workload 'nginx_from_manifest1'.
[2024-06-07T06:44:30Z INFO  ank_agent::runtime_manager] Replacing existing workload 'nginx_from_manifest2'.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::runtime_facade] Deleting 'podman' workload 'nginx_from_manifest2' on agent 'agent_A'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::runtime_facade] Searching for reusable 'podman-kube' workloads on agent 'agent_A'.
[2024-06-07T06:44:30Z DEBUG ank_agent::workload::workload_control_loop] Received WorkloadCommand::Resume.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A': '39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Deleting workload with id '39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96'
[2024-06-07T06:44:30Z INFO  ank_agent::runtime_manager] Found '0' existing 'podman-kube' workload(s).
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::runtime_facade] Creating 'podman' workload 'nginx_from_manifest2'.
[2024-06-07T06:44:30Z DEBUG ank_agent::workload::workload_control_loop] Received WorkloadCommand::Create.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] Creating the workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' with image 'docker.io/nginx:latest'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] The args are: '["run", "--detach", "--name", "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "-p", "8082:80", "--mount=type=bind,source=/tmp/ankaios/agent_A_io/nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3,destination=/run/ankaios/control_interface", "--label=name=nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "--label=agent=agent_A", "docker.io/nginx:latest"]'
[2024-06-07T06:44:30Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Pending(Starting), additional_info: "Triggered at runtime." } }'.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A': '0ed6c5e4d47a810a40b5ae6cba741db515297905a2a4839b9babcbfeb606b1e1'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Starting the checker for the workload 'nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A' with internal id '0ed6c5e4d47a810a40b5ae6cba741db515297905a2a4839b9babcbfeb606b1e1'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Creating container failed, cleaning up. Error: 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by 39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.'
[2024-06-07T06:44:30Z DEBUG ank_agent::generic_polling_state_checker] The workload nginx_from_manifest1 has changed its state to ExecutionState { state: Running(Ok), additional_info: "" }
[2024-06-07T06:44:30Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest1", id: "7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d" }, execution_state: ExecutionState { state: Running(Ok), additional_info: "" } }'.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] The broken container has been deleted successfully
[2024-06-07T06:44:30Z INFO  ank_agent::workload::workload_control_loop] Retrying workload creation for: 'nginx_from_manifest2'. Error: 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by 39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.'
[2024-06-07T06:44:30Z DEBUG ank_agent::workload::workload_control_loop] Received WorkloadCommand::Retry.
[2024-06-07T06:44:30Z DEBUG ank_agent::workload::workload_control_loop] Next retry attempt.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] Creating the workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' with image 'docker.io/nginx:latest'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] The args are: '["run", "--detach", "--name", "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "-p", "8082:80", "--mount=type=bind,source=/tmp/ankaios/agent_A_io/nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3,destination=/run/ankaios/control_interface", "--label=name=nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "--label=agent=agent_A", "docker.io/nginx:latest"]'
[2024-06-07T06:44:30Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Pending(StartingFailed), additional_info: "Execution of command failed: Error: creating container storage: the container name \"nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A\" is already in use by 39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so." } }'.
[2024-06-07T06:44:30Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Pending(Starting), additional_info: "Triggered at runtime." } }'.
[2024-06-07T06:44:31Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] The workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' has been created with internal id '4a172c98c62fe8947415bed60e204c6f79d91dfbe9ad83af3355322581e1ea11'
[2024-06-07T06:44:31Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Starting the checker for the workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' with internal id '4a172c98c62fe8947415bed60e204c6f79d91dfbe9ad83af3355322581e1ea11'
[2024-06-07T06:44:31Z INFO  ank_agent::workload::workload_control_loop] Successfully created workload 'nginx_from_manifest2'.
[2024-06-07T06:44:31Z DEBUG ank_agent::generic_polling_state_checker] The workload nginx_from_manifest2 has changed its state to ExecutionState { state: Running(Ok), additional_info: "" }
[2024-06-07T06:44:31Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Running(Ok), additional_info: "" } }'.
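
The flow visible in the updated traces (a failed create, a single INFO line announcing the retry together with the error, then an INFO line on success) can be sketched roughly as follows. This is only an illustration and not the actual agent code: `Traces`, `Level` and `create_with_retry` are hypothetical names, the in-memory trace collector stands in for the real logging macros, and the `create` closure stands in for the call into the runtime.

```rust
/// Hypothetical log levels; a stand-in for the real logging macros.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Info,
    Debug,
}

/// Collects trace lines in memory so the sketch stays self-contained.
#[derive(Default)]
struct Traces {
    lines: Vec<(Level, String)>,
}

impl Traces {
    fn info(&mut self, msg: String) {
        self.lines.push((Level::Info, msg));
    }
    fn debug(&mut self, msg: String) {
        self.lines.push((Level::Debug, msg));
    }
}

/// Tries to create a workload up to `max_attempts` times.
/// `create` is a hypothetical stand-in for the runtime call; it receives the
/// attempt number and returns the internal id or an error text.
fn create_with_retry<F>(
    name: &str,
    max_attempts: u32,
    traces: &mut Traces,
    mut create: F,
) -> Option<String>
where
    F: FnMut(u32) -> Result<String, String>,
{
    for attempt in 1..=max_attempts {
        match create(attempt) {
            Ok(id) => {
                traces.info(format!("Successfully created workload '{name}'."));
                return Some(id);
            }
            // A retry will follow: say so at INFO level and include the reason once.
            Err(err) if attempt < max_attempts => {
                traces.info(format!(
                    "Retrying workload creation for: '{name}'. Error: '{err}'"
                ));
                traces.debug("Next retry attempt.".to_string());
            }
            // No attempts left: only now report the creation as failed.
            Err(err) => {
                traces.info(format!("Failed to create workload: '{name}': '{err}'"));
            }
        }
    }
    None
}

fn main() {
    let mut traces = Traces::default();
    // Simulated runtime: the first attempt fails because the old container
    // name is still in use, the second one succeeds.
    let id = create_with_retry("nginx_from_manifest2", 3, &mut traces, |attempt| {
        if attempt == 1 {
            Err("container name is already in use".to_string())
        } else {
            Ok("example-internal-id".to_string())
        }
    });
    for (level, line) in &traces.lines {
        println!("{level:?} {line}");
    }
    println!("created: {id:?}");
}
```

The point of the sketch is that a failure which will be retried is reported as a retry, so the INFO output no longer suggests that the workload is never created.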

krucod3 commented 4 months ago

Done.