multiscale-cosim / EBRAINS-cosim

co-simulation orchestrator V2 #6

Closed w-klijn closed 2 years ago

w-klijn commented 4 years ago
| Aspect | Detail |
| --- | --- |
| Summary | The orchestrator is responsible for runtime coordination of information exchange |
| Task Area | functionality |
| Assignee | |
| Information | |
| Prerequisites | |
| Dependencies | |

Summary

Multi-scale co-simulation involves many different components exchanging data at runtime. In addition, the simulators need to synchronize time steps, exchange MPI configurations, and register ports for information exchange. Not all of this information is known at configuration time, so it must be exchanged at runtime. To avoid a multitude of point-to-point communications, a single access point (presumably using a message queue) will be available for each simulator to send and retrieve this information.

Additionally, the orchestrator will provide basic status information about the deployed components.
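
To make the idea of a single access point concrete, here is a minimal sketch, not the actual EBRAINS-cosim design. It assumes an in-process `queue.Queue` as a stand-in for a real message queue; the names `Message`, `Orchestrator`, `send`, `drain`, `retrieve`, and `status`, as well as the message kinds and the component names `nest` and `tvb`, are hypothetical and chosen only for illustration.

```python
import queue
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Message:
    sender: str   # name of the component sending the message (hypothetical)
    kind: str     # e.g. "register_port", "mpi_config", "time_step", "status"
    payload: Any  # the runtime information itself


@dataclass
class Orchestrator:
    # In-process queue used here as a stand-in for a real message queue.
    inbox: queue.Queue = field(default_factory=queue.Queue)
    # Latest payload per component and message kind.
    registry: dict = field(default_factory=dict)

    def send(self, message: Message) -> None:
        """Called by a component to publish runtime information."""
        self.inbox.put(message)

    def drain(self) -> int:
        """Process all queued messages and return how many were handled."""
        handled = 0
        while True:
            try:
                msg = self.inbox.get_nowait()
            except queue.Empty:
                return handled
            self.registry.setdefault(msg.sender, {})[msg.kind] = msg.payload
            handled += 1

    def retrieve(self, component: str, kind: str, default: Any = None) -> Any:
        """Called by a component to look up information from another one."""
        return self.registry.get(component, {}).get(kind, default)

    def status(self) -> dict:
        """Basic status overview of every component seen so far."""
        return {name: info.get("status", "unknown")
                for name, info in self.registry.items()}


orchestrator = Orchestrator()
orchestrator.send(Message("nest", "register_port", {"port": "spikes_out"}))
orchestrator.send(Message("tvb", "time_step", 0.1))
orchestrator.send(Message("nest", "status", "running"))
orchestrator.drain()
```

Each simulator talks only to the orchestrator, so adding a component does not require new point-to-point links. A real deployment would replace the in-process queue with a networked one and run `drain` in its own loop.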

Tasks

Requirements

Acceptance criteria

lionelkusch commented 4 years ago

Sandra (@sdiazpier) has already started thinking about the communication protocol for the orchestrator: https://docs.google.com/spreadsheets/d/1yApDi33wwxcuMYCAhVOQoRILQSXdQE9kzX0gI1EsXkA/edit#gid=0

lionelkusch commented 4 years ago

In the proof-of-concept TVB-NEST co-simulation (https://github.com/multiscale-cosim/TVB-NEST), I separated the orchestrator from the simulator and translator modules. This example can help you define what is important for the API.

dionperd commented 3 years ago

See also this comment: https://github.com/multiscale-cosim/EBRAINS-cosim/issues/21#issuecomment-711913337

maedoc commented 3 years ago

I would suggest etcd as a potential candidate for a single-source-of-truth key-value store that could replace the endless config files. Being non-MPI may be an advantage, as a sort of out-of-band channel if/when MPI issues arise.
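
As a rough illustration of the key-value idea only, the sketch below uses a plain in-memory dictionary as a stand-in. It does not use an etcd client; the `ConfigStore` class, its methods, and the key layout under `/cosim/` are all hypothetical. It shows how components could share settings under path-like keys and read a whole group of them by prefix, instead of each reading its own config file.

```python
class ConfigStore:
    """In-memory stand-in for a shared key-value store (not an etcd client)."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: str) -> None:
        """Store or overwrite a single value."""
        self._data[key] = value

    def get(self, key: str, default=None):
        """Return the value for a key, or a default if it is missing."""
        return self._data.get(key, default)

    def get_prefix(self, prefix: str) -> dict:
        """Return all entries whose key starts with the prefix, sorted by key."""
        return {k: v for k, v in sorted(self._data.items())
                if k.startswith(prefix)}


store = ConfigStore()
# Hypothetical key layout: /cosim/<component>/<setting>
store.put("/cosim/nest/mpi_port", "port_a")
store.put("/cosim/nest/time_step", "0.1")
store.put("/cosim/tvb/time_step", "0.1")

nest_settings = store.get_prefix("/cosim/nest/")
```

Because such a store sits outside MPI, components could still read and write it when MPI communication itself is the thing that failed.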

w-klijn commented 3 years ago

We will have a first version in the MVP. I have pushed this task to the next phase.

w-klijn commented 2 years ago

This is a very old task that is no longer up to date. Closing it; please open a new ticket if needed.