Closed tschaffter closed 3 years ago
Tagging @gkowalski because we discussed the controller issue this morning.
@thomasyu888 What is the typical time range when Synapse is down for its weekly update?
@thomasyu888 Can you apply this update?
I'm going to leave ELK alone for now. I think Synapse is typically down for 15-30 minutes during an update.
Aside from setting the restart policy for the controller, do you know how the controller would need to be modified to prevent it from crashing when Synapse is down? Maybe this could be a separate ticket.
The client is actually configured to do so but it's not effective 100% of the time.
Do the checked boxes mean that the change is in effect? If yes, we can close this ticket.
Yes, the checked boxes mean the change is in effect on the Sage side.
I see that the docker-compose of the data-node has the restart policy. What does George have to do to apply the update for the controller? I don't see a docker-compose to run.
@tschaffter, you added the restart policy in the data node: https://github.com/nlpsandbox/data-node/blob/main/docker-compose.yml. All he has to do is pull the image down and restart the data node and it should be applied.
I know. My question was for the controller.
Ah, clearly I'm in Labor Day mode and not reading carefully. Please review: https://github.com/Sage-Bionetworks/SynapseWorkflowOrchestrator/pull/32/files. Once merged, it is the same workflow as the one he uses to update the data node.
@tschaffter Am I good to go to pull down this latest code?
@gkowalski Yes. In the future we will write down specific instructions for updating the components of the infrastructure but for now let's move forward. The update consists of the following:
docker-compose.yml was updated 15 days ago to include the restart policy. If you don't know how to update, we can provide instructions.
Updated our data-node:
cd ~/data-node/
stop data-node
git pull
docker-compose up -d
The idea is that the containers of the infrastructure restart if they exit for a reason other than being manually stopped. This change should help fix the issue where the controller container sometimes dies during weekly Synapse updates.
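For reference, a restart policy with this behavior might look roughly like the sketch below in a docker-compose.yml. This is only an illustration: the service name and image are placeholders, and the policy value `unless-stopped` is an assumption based on the behavior described above, not copied from the data-node or controller repositories.

```yaml
# Minimal sketch of a compose file with a restart policy.
# "controller" and the image name are hypothetical placeholders.
version: "3.8"
services:
  controller:
    image: example/controller:latest  # placeholder, not the real image
    # Restart the container whenever it exits, except after a manual
    # "docker stop" / "docker-compose stop".
    restart: unless-stopped
```

With a policy like this, Docker should bring the container back up if it exits during a Synapse outage, though the process inside may still need its own retry logic to recover cleanly.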