Open inf17101 opened 2 months ago
After analyzing the issue: the server does not send the WorkloadState with ExecutionState::Removed for workloads that are in Pending::Initial and were removed by the previously executed update state. To handle this situation correctly and introduce a proper fix, the server needs to know about the connected agents. The list agents feature #155 is scheduled for the next release, so the bug fix, which relies on this feature, will be moved to the next release too.
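A minimal sketch of the idea described above, not the actual Ankaios implementation: if the server knows which agents are connected, it can itself report Removed for deleted workloads that never left the initial pending state because their agent is absent. The types ExecutionState and WorkloadState below are simplified stand-ins, and the function name removed_states_for_deleted and the connected_agents parameter are hypothetical.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the execution states mentioned in the issue
/// (assumption: the real Ankaios types are richer than this).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ExecutionState {
    PendingInitial,
    Running,
    Removed,
}

/// Simplified stand-in for a workload state entry (hypothetical fields).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WorkloadState {
    pub workload_name: String,
    pub agent_name: String,
    pub execution_state: ExecutionState,
}

/// For the workloads deleted by an update, return the `Removed` states the
/// server would have to send on its own: those still in the initial pending
/// state whose agent is not connected, so no agent will ever report them.
pub fn removed_states_for_deleted(
    deleted: &[WorkloadState],
    connected_agents: &HashSet<String>,
) -> Vec<WorkloadState> {
    deleted
        .iter()
        .filter(|w| {
            w.execution_state == ExecutionState::PendingInitial
                && !connected_agents.contains(&w.agent_name)
        })
        .map(|w| WorkloadState {
            execution_state: ExecutionState::Removed,
            ..w.clone()
        })
        .collect()
}
```

Workloads on connected agents are left out on purpose in this sketch, under the assumption that their agent reports the removal itself.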
Currently working on this. The topic is a little more complex to solve than initially thought: there are more cases that get stuck, for example unscheduled workloads, and the way the wait list is currently initialized on the main branch with data from the requested complete state is not correct for some corner cases.
I have committed a first solution for most of the stuck cases to the branch https://github.com/eclipse-ankaios/ankaios/tree/320_fix_deleted_pending_workloads. However, it currently contains only a basic implementation for testing that the stuck wait list and the missing information in the table of deleted pending workloads are fixed. Nevertheless, I need to refactor the code: I want to get rid of the BTreeMap, which I initially thought I needed when starting to implement the bug fix, and I will also eliminate the second get_complete_state request, since the newly added workloads can be taken from the newly constructed complete state.
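As a rough illustration of why a second get_complete_state request may be unnecessary: once the current state and the newly constructed complete state are both at hand, the added and deleted workloads follow from two set differences. This is only a sketch under the assumption that a complete state can be reduced to a set of workload names; WorkloadDiff and diff_workloads are hypothetical names, not part of the Ankaios code base.

```rust
use std::collections::HashSet;

/// Result of comparing two complete states by workload name (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct WorkloadDiff {
    pub added: Vec<String>,
    pub deleted: Vec<String>,
}

/// Compute added and deleted workloads locally from the current state and
/// the newly constructed state, without asking the server again.
pub fn diff_workloads(current: &HashSet<String>, new: &HashSet<String>) -> WorkloadDiff {
    // In `new` but not in `current`: workloads the update adds.
    let mut added: Vec<String> = new.difference(current).cloned().collect();
    // In `current` but not in `new`: workloads the update removes.
    let mut deleted: Vec<String> = current.difference(new).cloned().collect();
    // Sort for a deterministic order, since HashSet iteration order is unspecified.
    added.sort();
    deleted.sort();
    WorkloadDiff { added, deleted }
}
```

The deleted list is what a wait list would have to track until each entry is reported as removed.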
I will continue on this after my vacation, since tests and requirements must be adapted as well.
When using ank set state in wait mode, if the new complete state removes workloads that are in the current desired state but were never initially started because their Ankaios agent is not running, the wait mode gets stuck and hangs.
Current Behavior
The Ank CLI set state command hangs:
Expected Behavior
Set state shall recognize that these workloads were never initially started but have been removed. It shall not get stuck.
Steps to Reproduce
To reproduce the issue you can use the provided example startConfig.yml inside the repository. The config contains one workload for agent_A and three other workloads for agent_B.
Context (Environment)
Ank CLI, ank set state command, all supported platforms
Logs
Additional Information
Final result
To be filled by the one closing the issue.