redhat-cop / osia

Tool for reliable automated deployments of OpenShift Container Platform 4.x into OpenStack and AWS.
https://osia-python.rtfd.io
Apache License 2.0

Cluster Deletion #65

Open rnc opened 1 year ago

rnc commented 1 year ago

I noticed this when accidentally running clean twice.

The trace looks like:

osia clean --cluster-name=nick
2023-02-09 14:37:27 install.py:151    INFO Found installer at installers/4.12.2
FATAL Failed while preparing to destroy cluster: open nick/metadata.json: no such file or directory 
2023-02-09 14:37:27 executor.py:102   ERROR Re-executing installer due to error (InstallerExecutionException(...), 'Failed execution of installer')
FATAL Failed while preparing to destroy cluster: open nick/metadata.json: no such file or directory 
2023-02-09 14:37:27 executor.py:102   ERROR Re-executing installer due to error (InstallerExecutionException(...), 'Failed execution of installer')
2023-02-09 14:37:27 storage.py:64     INFO Removing cluster directory from git repository nick
Traceback (most recent call last):
  File "/home/rnc/.virtualenvs/osia-_uzfVagR-py3.11/bin/osia", line 5, in <module>
    main_cli()

I think this is due to the perhaps slightly confusing try/except in delete_cluster. If that try/except is removed, then I get:

osia clean --cluster-name=nick2
2023-02-09 14:44:37 install.py:151    INFO Found installer at installers/4.12.2
FATAL Failed while preparing to destroy cluster: open nick2/metadata.json: no such file or directory 
Traceback (most recent call last):
  File "/home/rnc/.virtualenvs/osia-_uzfVagR-py3.11/bin/osia", line 5, in <module>
    main_cli()
  File "/home/rnc/Work/OpenStack/osia/osia/cli.py", line 211, in main_cli
    args.func(args)
  File "/home/rnc/Work/OpenStack/osia/osia/cli.py", line 130, in _exec_delete_cluster
    delete_cluster(conf['cluster_name'], conf['installer'])
  File "/home/rnc/Work/OpenStack/osia/osia/installer/executor.py", line 99, in delete_cluster
    execute_installer(installer, cluster_name, 'destroy')
  File "/home/rnc/Work/OpenStack/osia/osia/installer/executor.py", line 43, in execute_installer
    raise InstallerExecutionException("Failed execution of installer")
osia.installer.executor.InstallerExecutionException: (InstallerExecutionException(...), 'Failed execution of installer')

which seems a bit easier to understand.

rnc commented 1 year ago

Question: what is the for loop for (added in https://github.com/redhat-cop/osia/commit/7077f5a2287c6440353a4c0ba2021d85950629ec)?

mijaros commented 1 year ago

@rnc the delete is re-attempted because openshift-install is unstable, especially in the OpenStack environment. Cluster removal tends to fail and leave some residual resources behind, and re-executing it mitigates the vast majority of these cases. The second attempt can still fail, in which case the resulting Terraform files are left intact, but the retry solves the issue in many situations.
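One way to keep the retry for flaky destroys while still producing a clear error for the double-clean case reported above might look like the sketch below. It is not osia's actual code: `delete_with_retry`, the `run_destroy` callable, and the `attempts` parameter are hypothetical names used for illustration, and `InstallerExecutionException` is redefined here as a stand-in for the class seen in the traceback. The assumption is that a missing `metadata.json` can never be fixed by retrying, so it is checked once up front.

```python
from pathlib import Path
from typing import Callable, Optional


class InstallerExecutionException(Exception):
    """Stand-in for the exception class shown in the traceback (assumption)."""


def delete_with_retry(cluster_name: str,
                      run_destroy: Callable[[str], None],
                      attempts: int = 2) -> None:
    """Retry a flaky destroy step, but fail fast when a retry cannot help.

    `run_destroy` is a hypothetical callable that wraps the installer's
    destroy step and raises InstallerExecutionException on failure.
    """
    metadata = Path(cluster_name) / "metadata.json"
    if not metadata.is_file():
        # Without metadata.json every attempt fails the same way,
        # so report the real cause once instead of retrying.
        raise FileNotFoundError(
            f"{metadata} not found; has the cluster already been cleaned?")

    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            run_destroy(cluster_name)
            return  # destroy succeeded, no further attempts needed
        except InstallerExecutionException as err:
            last_error = err
            print(f"destroy attempt {attempt}/{attempts} failed: {err}")

    # Every attempt failed: surface the last error to the caller.
    raise InstallerExecutionException(
        f"destroy failed after {attempts} attempts") from last_error
```

With this shape, a second `clean` on an already-removed cluster would stop at the up-front check with a single message, while transient destroy failures would still get their second attempt.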