LLNL / merlin

Machine Learning for HPC Workflows
MIT License

docs/server-cross-node #470

Closed bgunnar5 closed 4 months ago

bgunnar5 commented 5 months ago

I've added instructions to the documentation on how to set up a containerized server to run cross-node workflows. I also renamed merlin_server.md to containerized_server.md for clarity. The new section, titled "Running Cross-Node Workflows with a Containerized Server", is at the end of that file.

This may be a band-aid solution for the Redis socket timeout issues. I'd like to get this merged so I can point users to it as something to try to avoid the timeouts while I look further into the root cause of the problem.

bgunnar5 commented 5 months ago

Link to the page with most of the updates: https://merlin--470.org.readthedocs.build/en/470/user_guide/configuration/containerized_server/