Closed krucod3 closed 4 months ago
After updating the traces:
[2024-06-07T06:42:14Z INFO ank_agent::agent_manager] Awaiting commands from the server ...
[2024-06-07T06:42:14Z INFO ank_agent::runtime_manager] Received a new desired state with '2' added and '0' deleted workloads.
[2024-06-07T06:42:14Z INFO ank_agent::runtime_manager] Found '2' existing 'podman' workload(s).
[2024-06-07T06:42:14Z INFO ank_agent::runtime_manager] Resuming workload 'nginx_from_manifest1'
[2024-06-07T06:42:14Z INFO ank_agent::runtime_manager] Replacing existing workload 'nginx_from_manifest2'.
[2024-06-07T06:42:15Z INFO ank_agent::runtime_manager] Found '0' existing 'podman-kube' workload(s).
[2024-06-07T06:42:15Z INFO ank_agent::workload::workload_control_loop] Retrying workload creation for: 'nginx_from_manifest2'. Error: 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by 5ac804d76070b698e7d7d229b1b9c9af178043af276cfad4eefbc5772a0a93e1. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.'
[2024-06-07T06:42:15Z INFO ank_agent::workload::workload_control_loop] Successfully created workload 'nginx_from_manifest2'.
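The container names in the traces above follow a fixed pattern: the workload name, the workload id hash, and the agent name, joined by dots. A minimal sketch of that naming scheme (illustrative only, not the actual Ankaios code):

```python
# Illustrative sketch of the instance-name pattern visible in the traces:
# <workload_name>.<workload_id>.<agent_name>
def instance_name(workload: str, workload_id: str, agent: str) -> str:
    return f"{workload}.{workload_id}.{agent}"

name = instance_name(
    "nginx_from_manifest2",
    "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3",
    "agent_A",
)
```

This is why the "name is already in use" error occurs: the replacement workload keeps the same id and therefore collides with the old container until it is fully removed.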
Here are the same traces with debug logging enabled:
[2024-06-07T06:44:29Z DEBUG ank_agent] Starting the Ankaios agent with
name: 'agent_A',
server url: 'http://127.0.0.1:25551/',
run directory: '/tmp/ankaios/'
[2024-06-07T06:44:29Z INFO ank_agent::agent_manager] Awaiting commands from the server ...
[2024-06-07T06:44:29Z DEBUG grpc::client] gRPC Communication Client starts.
[2024-06-07T06:44:29Z DEBUG ank_agent::agent_manager] Process command received from server.
[2024-06-07T06:44:29Z DEBUG ank_agent::agent_manager] Agent 'agent_A' received UpdateWorkload:
Added workloads: [WorkloadSpec { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest1", id: "7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d" }, tags: [], dependencies: {}, restart_policy: Never, runtime: "podman", runtime_config: "image: docker.io/nginx:latest\ncommandOptions: [\"-p\", \"8081:80\"]\n" }, WorkloadSpec { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, tags: [], dependencies: {}, restart_policy: Never, runtime: "podman", runtime_config: "image: docker.io/nginx:latest\ncommandOptions: [\"-p\", \"8082:80\"]\n" }]
Deleted workloads: []
[2024-06-07T06:44:29Z INFO ank_agent::runtime_manager] Received a new desired state with '2' added and '0' deleted workloads.
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_manager] Handling initial workload list.
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::runtime_facade] Searching for reusable 'podman' workloads on agent 'agent_A'.
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found 2 reusable workload(s): '["nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A", "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A"]'
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A'
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A': '0ed6c5e4d47a810a40b5ae6cba741db515297905a2a4839b9babcbfeb606b1e1'
[2024-06-07T06:44:29Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A': '39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96'
[2024-06-07T06:44:30Z INFO ank_agent::runtime_manager] Found '2' existing 'podman' workload(s).
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_manager] Creating control interface pipes for 'WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest1", id: "7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d" }'
[2024-06-07T06:44:30Z INFO ank_agent::runtime_manager] Resuming workload 'nginx_from_manifest1'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::runtime_facade] Resuming 'podman' workload 'nginx_from_manifest1'.
[2024-06-07T06:44:30Z INFO ank_agent::runtime_manager] Replacing existing workload 'nginx_from_manifest2'.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::runtime_facade] Deleting 'podman' workload 'nginx_from_manifest2' on agent 'agent_A'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::runtime_facade] Searching for reusable 'podman-kube' workloads on agent 'agent_A'.
[2024-06-07T06:44:30Z DEBUG ank_agent::workload::workload_control_loop] Received WorkloadCommand::Resume.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] Listing workload ids for: name='nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A': '39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Deleting workload with id '39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96'
[2024-06-07T06:44:30Z INFO ank_agent::runtime_manager] Found '0' existing 'podman-kube' workload(s).
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::runtime_facade] Creating 'podman' workload 'nginx_from_manifest2'.
[2024-06-07T06:44:30Z DEBUG ank_agent::workload::workload_control_loop] Received WorkloadCommand::Create.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] Creating the workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' with image 'docker.io/nginx:latest'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] The args are: '["run", "--detach", "--name", "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "-p", "8082:80", "--mount=type=bind,source=/tmp/ankaios/agent_A_io/nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3,destination=/run/ankaios/control_interface", "--label=name=nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "--label=agent=agent_A", "docker.io/nginx:latest"]'
[2024-06-07T06:44:30Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Pending(Starting), additional_info: "Triggered at runtime." } }'.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Found an id for workload 'nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A': '0ed6c5e4d47a810a40b5ae6cba741db515297905a2a4839b9babcbfeb606b1e1'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Starting the checker for the workload 'nginx_from_manifest1.7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d.agent_A' with internal id '0ed6c5e4d47a810a40b5ae6cba741db515297905a2a4839b9babcbfeb606b1e1'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Creating container failed, cleaning up. Error: 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by 39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.'
[2024-06-07T06:44:30Z DEBUG ank_agent::generic_polling_state_checker] The workload nginx_from_manifest1 has changed its state to ExecutionState { state: Running(Ok), additional_info: "" }
[2024-06-07T06:44:30Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest1", id: "7d6ea2b79cea1e401beee1553a9d3d7b5bcbb37f1cfdb60db1fbbcaa140eb17d" }, execution_state: ExecutionState { state: Running(Ok), additional_info: "" } }'.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] The broken container has been deleted successfully
[2024-06-07T06:44:30Z INFO ank_agent::workload::workload_control_loop] Retrying workload creation for: 'nginx_from_manifest2'. Error: 'Execution of command failed: Error: creating container storage: the container name "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A" is already in use by 39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so.'
[2024-06-07T06:44:30Z DEBUG ank_agent::workload::workload_control_loop] Received WorkloadCommand::Retry.
[2024-06-07T06:44:30Z DEBUG ank_agent::workload::workload_control_loop] Next retry attempt.
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] Creating the workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' with image 'docker.io/nginx:latest'
[2024-06-07T06:44:30Z DEBUG ank_agent::runtime_connectors::podman_cli] The args are: '["run", "--detach", "--name", "nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "-p", "8082:80", "--mount=type=bind,source=/tmp/ankaios/agent_A_io/nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3,destination=/run/ankaios/control_interface", "--label=name=nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A", "--label=agent=agent_A", "docker.io/nginx:latest"]'
[2024-06-07T06:44:30Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Pending(StartingFailed), additional_info: "Execution of command failed: Error: creating container storage: the container name \"nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A\" is already in use by 39496fe96ad874a9b2b4ae15e7ccf9cdf507d2d2613c12ea02457f8cd8960d96. You have to remove that container to be able to reuse that name: that name is already in use, or use --replace to instruct Podman to do so." } }'.
[2024-06-07T06:44:30Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Pending(Starting), additional_info: "Triggered at runtime." } }'.
[2024-06-07T06:44:31Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] The workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' has been created with internal id '4a172c98c62fe8947415bed60e204c6f79d91dfbe9ad83af3355322581e1ea11'
[2024-06-07T06:44:31Z DEBUG ank_agent::runtime_connectors::podman::podman_runtime] Starting the checker for the workload 'nginx_from_manifest2.cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3.agent_A' with internal id '4a172c98c62fe8947415bed60e204c6f79d91dfbe9ad83af3355322581e1ea11'
[2024-06-07T06:44:31Z INFO ank_agent::workload::workload_control_loop] Successfully created workload 'nginx_from_manifest2'.
[2024-06-07T06:44:31Z DEBUG ank_agent::generic_polling_state_checker] The workload nginx_from_manifest2 has changed its state to ExecutionState { state: Running(Ok), additional_info: "" }
[2024-06-07T06:44:31Z DEBUG ank_agent::agent_manager] Storing and forwarding local workload state 'WorkloadState { instance_name: WorkloadInstanceName { agent_name: "agent_A", workload_name: "nginx_from_manifest2", id: "cfef8c463f359e05114ddd6bbb184a3894944b52deeab966e3b0139e74694cf3" }, execution_state: ExecutionState { state: Running(Ok), additional_info: "" } }'.
Done.
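The sequence visible in the debug traces — creation fails because the old container still holds the name, the stale container is cleaned up, then a retry succeeds — can be sketched roughly as follows. The names (`create_container`, `remove_container`, `NameInUse`) are hypothetical stand-ins for illustration, not the real Ankaios or Podman APIs:

```python
# Simplified illustration of the retry behavior seen in the traces above.
# All identifiers here are hypothetical; this is not the Ankaios control loop.
class NameInUse(Exception):
    """Raised when a container with the same instance name still exists."""

def create_with_retry(runtime, name, image, max_retries=2):
    for _ in range(max_retries + 1):
        try:
            return runtime.create_container(name, image)
        except NameInUse:
            # The previous container with this instance name has not been
            # fully removed yet; delete it and retry, as the workload
            # control loop appears to do.
            runtime.remove_container(name)
    raise RuntimeError(f"giving up on '{name}'")
```

Under this reading, the "Retrying workload creation" INFO line is expected behavior during a replace, which is exactly what the current traces fail to make clear.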
Currently the traces from the agent when creating/replacing/resuming workloads are repetitive and confusing; the logs above are one example.
According to the traces, the workload that failed during creation appears not to be created at all, and no reason is given for why it is created again.
Looking at the debug traces answers some of these questions.
Current Behavior
Traces are confusing
Expected Behavior
Traces explain what is done by the agent
Steps to Reproduce
Context (Environment)
Ankaios built from the main branch
Logs
Additional Information
Final result
To be filled by the one closing the issue.