ScaleComputing / HyperCoreAnsibleCollection

Official Ansible collection for Scale Computing SC//HyperCore (HC3) v1 API
GNU General Public License v3.0

:lady_beetle: Bug: small `shutdown_timeout` and `force_shutdown` do not work as expected #288

Open justinc1 opened 6 months ago

Describe the bug

The VM was booted from a Porteus ISO image (ACPI shutdown does work in this guest). I tried to update the VM with a task like:

#  vm_shutdown_timeout in tests/integration/targets/vm/tasks/01_main.yml
    - name: Update the VM
      environment:
        SC_DEBUG_LOG_TRAFFIC: 1
      scale_computing.hypercore.vm: &update-vm
        vm_name: vm-integration-test-vm
        force_reboot: true
        shutdown_timeout: "{{ vm_shutdown_timeout }}"

This works with vm_shutdown_timeout=300. The VM needs (I think) about 20 seconds to shut down.

With vm_shutdown_timeout=3 it does not. The VM gets a SHUTDOWN power action (no tasktag is returned, so the module polls for a VM state change) and starts to shut down. After about 10 seconds it gets a STOP power action. The STOP tasktag stays in the running state for a long time, even after the VM has already stopped, and then goes into state=ERROR. The vm module keeps waiting on the STOP tasktag.

It is hard to know in advance what shutdown_timeout is needed - it depends on VM OS and applications.

Maybe the module should just ignore the failed STOP power action, since the SHUTDOWN power action did work; it just took longer than expected.
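
To make the idea concrete, here is a minimal Python sketch of that fallback logic. It is not the collection's code and does not call the HyperCore API: `FakeVM`, `stop_vm`, the poll-count timeout, and the state names `RUNNING`, `SHUTOFF`, `COMPLETE` and `ERROR` are all assumptions chosen for illustration. The fake VM only imitates the behaviour observed above, where a STOP issued during an ACPI shutdown ends with a failed task although the VM is stopped.

```python
class FakeVM:
    """In-memory stand-in for a VM (hypothetical, not the HyperCore API)."""

    def __init__(self, ticks_to_stop, stop_fails=False):
        self.ticks_left = ticks_to_stop  # polls the guest needs to shut down
        self.stop_fails = stop_fails     # True: forced STOP has no effect
        self.shutdown_requested = False

    def request_acpi_shutdown(self):
        # Like the SHUTDOWN power action: nothing to wait on, caller must poll.
        self.shutdown_requested = True

    def poll_state(self):
        # Each poll advances the simulated guest shutdown by one tick.
        if self.shutdown_requested and self.ticks_left > 0:
            self.ticks_left -= 1
        return "SHUTOFF" if self.ticks_left == 0 else "RUNNING"

    def force_stop(self):
        # Like the STOP power action; returns the final task state.
        if self.stop_fails:
            return "ERROR"  # task failed and the VM is still running
        self.ticks_left = 0
        # Imitate the reported corner case: STOP issued during an ACPI
        # shutdown ends in ERROR even though the VM is stopped.
        return "ERROR" if self.shutdown_requested else "COMPLETE"


def stop_vm(vm, shutdown_polls):
    """Try ACPI shutdown first, then STOP; tolerate a failed STOP task
    if the VM turns out to be stopped anyway."""
    vm.request_acpi_shutdown()
    for _ in range(shutdown_polls):
        if vm.poll_state() == "SHUTOFF":
            return "acpi"
    task_state = vm.force_stop()
    if task_state == "COMPLETE":
        return "forced"
    # The STOP task failed: check the VM itself before reporting an error.
    if vm.poll_state() == "SHUTOFF":
        return "acpi-late"
    raise RuntimeError("VM is still running after SHUTDOWN and STOP")


if __name__ == "__main__":
    print(stop_vm(FakeVM(ticks_to_stop=2), shutdown_polls=5))
    print(stop_vm(FakeVM(ticks_to_stop=20), shutdown_polls=3))
```

The point of the sketch is only the last check in `stop_vm`: a failed STOP task is treated as an error only if the VM is still running afterwards.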

Maybe the HyperCore API should be updated to tolerate this specific corner case. @TomboScaleComputing

To Reproduce

Steps to reproduce the behavior:

# modify vm_shutdown_timeout in tests/integration/targets/vm/tasks/01_main.yml
ansible-test integration vm -v

Expected behavior

The VM should shut down. It is fine if both the graceful ACPI shutdown and the forced shutdown are tried, as long as one of them manages to stop the VM.

System Info (please complete the following information):