nlpsandbox / nlpsandbox-infra

AWS CloudFormation templates for deploying the NLP Sandbox infrastructure
Apache License 2.0

Configure the containers of the infra to always restart #36

Closed tschaffter closed 3 years ago

tschaffter commented 3 years ago

The idea is that the containers of the infrastructure restart if they exit for any reason other than being manually stopped. This change should help fix the issue where the controller container sometimes dies during the weekly Synapse updates.
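For illustration, the behavior described above could be expressed in a docker-compose file roughly as follows. This is a minimal sketch, not the project's actual configuration: the service name `controller` and the image name are placeholders. Of Docker's restart policies (`no`, `on-failure`, `always`, `unless-stopped`), `unless-stopped` is the one that restarts a container after a crash but leaves it stopped once someone has stopped it manually, even after the Docker daemon restarts.

```yaml
# Hypothetical docker-compose.yml fragment; service and image names are placeholders.
version: "3"
services:
  controller:
    image: example/controller:latest  # placeholder, not the real image
    # Restart after a crash or daemon restart, but not after a manual `docker stop`.
    # `restart: always` behaves the same except that a manually stopped
    # container comes back when the Docker daemon restarts.
    restart: unless-stopped
```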

tschaffter commented 3 years ago

Tagging @gkowalski because we discussed the controller issue this morning.

@thomasyu888 What is the typical time range when Synapse is down for its weekly update?

tschaffter commented 3 years ago

@thomasyu888 Can you apply this update?

thomasyu888 commented 3 years ago

I'm going to leave ELK alone for now. I think Synapse is typically down for 15-30 minutes during its update.

tschaffter commented 3 years ago

Aside from setting the restart policy for the controller, do you know how the controller would need to be modified to prevent it from crashing when Synapse is down? Maybe this could be a separate ticket.

thomasyu888 commented 3 years ago

The client is actually configured to do so but it's not effective 100% of the time.
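As a point of reference, one common way for a client to ride out an outage like the 15-30 minute window mentioned above is to retry failed calls with exponential backoff until a time budget is spent. The sketch below is generic and is not how the Synapse client or the controller actually implements it; the function name `call_with_backoff`, its parameters, and the choice of `ConnectionError` as the retryable error are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    operation: Callable[[], T],
    max_total_wait: float = 45 * 60,   # assumed budget, longer than a 30-minute outage
    initial_delay: float = 5.0,
    max_delay: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,  # injectable so it can be tested
) -> T:
    """Call operation(), retrying on ConnectionError with exponential backoff.

    Re-raises the last ConnectionError once the accumulated sleep time
    reaches max_total_wait.
    """
    waited = 0.0
    delay = initial_delay
    while True:
        try:
            return operation()
        except ConnectionError:
            if waited >= max_total_wait:
                raise  # budget exhausted: let the caller (or restart policy) handle it
            sleep(delay)
            waited += delay
            # Double the delay after each failure, capped at max_delay.
            delay = min(delay * 2, max_delay)
```

With a retry budget longer than the expected downtime, the process keeps waiting instead of exiting, and the container restart policy serves as a fallback for the cases the retries do not cover.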

tschaffter commented 3 years ago

Do the checked boxes mean that the change is in effect? If so, we can close this ticket.

thomasyu888 commented 3 years ago

Yes, the checked boxes mean the change is in effect on the Sage side.

tschaffter commented 3 years ago

I see that the docker-compose of the data-node has the restart policy. What does George have to do to apply the update for the controller? I don't see a docker-compose to run.

thomasyu888 commented 3 years ago

@tschaffter, you added the restart policy in the data node: https://github.com/nlpsandbox/data-node/blob/main/docker-compose.yml. All he has to do is pull the image down and restart the data node and it should be applied.

tschaffter commented 3 years ago

I know. My question was for the controller.

thomasyu888 commented 3 years ago

Ah, clearly I'm in Labor Day mode and not reading carefully. Please review: https://github.com/Sage-Bionetworks/SynapseWorkflowOrchestrator/pull/32/files. Once merged, the workflow is the same as the one he uses to update the data node.

gkowalski commented 3 years ago

@tschaffter Am I good to go to pull down this latest code?

tschaffter commented 3 years ago

@gkowalski Yes. In the future we will write down specific instructions for updating the components of the infrastructure but for now let's move forward. The update consists of the following:

gkowalski commented 3 years ago

Updated our data-node:

cd ~/data-node/
stop data-node
git pull
docker-compose up -d