Ansible task to 'pause' doesn't always wait long enough, or waits when it is not needed

HCL-TECH-SOFTWARE / connections-automation

Deployment and upgrade automation scripts for HCL Connections 7.0 based on Ansible

Apache License 2.0

17 stars 32 forks source link

Ansible task to 'pause' doesn't always wait long enough, or waits when it is not needed #118

Closed andylwelch closed 3 years ago

andylwelch commented 3 years ago

There are several places (33) in the ansible tasks which wait an arbitrary amount of time for the helm upgrade/install command to complete. This is inefficient and not always long enough.

For example:

These should follow the new and improved way to kubectl wait until the pod in question has started. I can see this has been done 8 times already:

Could you please replace the remaining pause tasks, and also considering doubling the --timeout option on the wait command.

dejo1307 commented 3 years ago

Hi Andrew,

This was added just yesterday with the latest release. Can you check and confirm? (I added it myself).

andylwelch commented 3 years ago

Thanks @dejo1307 I see you recently removed a lot of pauses, fantastic. I also see that a lot of the pauses are waiting for was/dmgr etc and are necessary.

There are a few more places that could be changed. For example: setup_elasticsearch.yml line 22 setup_elasticstack.yml line 18 setup_elasticstack7.yml line 18 setup_ingress.yml line 71

It could be a great time to use the wait & timeout flags on the precending helm upgrade command, for example: helm upgrade elasticsearch ... --wait helm upgrade elasticstack ... --wait --timeout 60 helm upgrade elasticstack-7 ... --wait --timeout 60 helm upgrade cnx-ingress ... --wait --timeout 60

The timeout could then be increased as it wont wait the entire time. The default is already 300 so the flag doesn't have to be specified.

I'm not sure if the kubectl wait or helm upgrade --wait can also be used for pvs (setup_connections_volumes.yml) or configmaps (setup_connenv.yml)?

What are your thoughts? Thanks!

dejo1307 commented 3 years ago

It's funny that I thought that --wait was Helm v3 specific, and I just checked the docs and it's not.

It makes sense, let me add it, test it, and will merge it here as soon as we test it internally.

Good point about other stuff mentioned above - someone probably forgot to add it as it is pre 7 stuff (minus ELK Stack 7) but will add that all together.

For PVs and configmaps I don't think we need it. The reason it is there on the first place (and I will clean it up) was historical, where in some cases one command (mostly next one) kicks too early, but stuff is too fast now for something like that to occur. Great point as well, time to clean that too.

dejo1307 commented 3 years ago

Hey @andylwelch,

I've just created new release which contains these fixes as well.

Do you also mind adding yourself in https://github.com/HCL-TECH-SOFTWARE/connections-automation/blob/main/AUTHORS (just create PR and I will approve it).

Thanks a lot, Dejan

ukhare commented 3 years ago

@andylwelch I believe this issue has been fixed, if so could you please close it ?

andylwelch commented 3 years ago

Thanks, I believe this was updated