ElmerCSC / elmerfem

Official git repository of Elmer FEM software
http://www.elmerfem.org

Running multiple Elmer configurations at once sharing the same detached XIOS instance #451

Closed by lucas-ige 3 months ago

lucas-ige commented 4 months ago

Hello,

An execution ID for Elmer is required when using XIOS (see xios_initialize). In an MPI run that calls multiple programs (MPMD), this ID is used to group MPI processes by program, so that each program can implement internal parallelization without interfering with other programs. As of now, this ID is hard-coded (variable xios_id in fem/src/Types.F90, with value "elmerice"). It is therefore not possible to run different configurations of Elmer simultaneously in detached XIOS mode within a single MPI run: they all share the same ID, whereas each configuration needs its own.
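To illustrate why the hard-coded ID is a problem, here is a minimal sketch in plain Python (no MPI involved) of grouping processes by their declared ID. The function name and the rank layout are hypothetical, and this is not how XIOS or Elmer implement the grouping internally; it only mimics the idea that ranks declaring the same ID end up in the same group.

```python
from collections import defaultdict


def group_ranks_by_exec_id(exec_ids):
    """Map each execution ID to the list of world ranks that declared it.

    exec_ids[i] is the ID declared by world rank i (hypothetical layout).
    """
    groups = defaultdict(list)
    for rank, exec_id in enumerate(exec_ids):
        groups[exec_id].append(rank)
    return dict(groups)


# Hypothetical MPMD run with two Elmer configurations, 2 ranks each
# (the XIOS server rank is left out for simplicity).
hard_coded = ["elmerice"] * 4
distinct = ["greenland"] * 2 + ["antarctica"] * 2

# With the hard-coded ID, both configurations collapse into a single group.
print(group_ranks_by_exec_id(hard_coded))
# With distinct IDs, each configuration gets its own group of ranks.
print(group_ranks_by_exec_id(distinct))
```

With a single shared ID, the two configurations cannot be told apart, which is the limitation described above.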

I am proposing a commit that makes this possible. With this commit, the user can specify the ID at run-time using the -exec-id command line argument. For example:

mpirun -np 10 ./ElmerSolver_mpi greenland.sif -exec-id greenland : -np 1 ./xios

For backward compatibility, -exec-id can be omitted, and it defaults to "elmerice".
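The intended behavior of the argument can be sketched as follows. This is a Python illustration only, not the Fortran code from the commit; the function name is hypothetical, and raising an error when the value is missing is an assumption rather than documented Elmer behavior.

```python
DEFAULT_EXEC_ID = "elmerice"  # default stated above, kept for backward compatibility


def parse_exec_id(argv):
    """Return the value following -exec-id, or the default if the flag is absent."""
    for i, arg in enumerate(argv):
        if arg == "-exec-id":
            if i + 1 >= len(argv):
                # Assumption: a flag without a value is treated as an error.
                raise ValueError("-exec-id requires a value")
            return argv[i + 1]
    return DEFAULT_EXEC_ID


print(parse_exec_id(["greenland.sif", "-exec-id", "greenland"]))
print(parse_exec_id(["greenland.sif"]))
```

The first call returns the user-supplied ID, and the second falls back to "elmerice".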

Internally, the variable xios_id has been renamed to ExecID because it can be used outside of XIOS (for example: external couplers like OASIS may rely on such an ID).

This has applications in climate modelling where multiple configurations of Elmer may need to be run at once, sharing a single XIOS server (for example: one for Greenland and one for Antarctica). This could also be useful for ensemble modelling.

Please note that the context ID in Elmer's XIOS configuration files should match the value of ExecID.
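A quick way to catch a mismatch before launching a run could look like the sketch below. It assumes the XIOS configuration is an XML document containing context elements with an id attribute; the sample XML is a stripped-down, hypothetical configuration, the function name is made up for illustration, and contexts pulled in from other files are not followed.

```python
import xml.etree.ElementTree as ET


def has_context(xml_text, exec_id):
    """Return True if the XML text declares a <context> whose id equals exec_id."""
    root = ET.fromstring(xml_text)
    return any(ctx.get("id") == exec_id for ctx in root.iter("context"))


# Hypothetical, minimal configuration used only for this illustration.
SAMPLE = """<simulation>
  <context id="greenland"/>
  <context id="xios"/>
</simulation>"""

print(has_context(SAMPLE, "greenland"))
print(has_context(SAMPLE, "elmerice"))
```

Here the check succeeds for "greenland" and fails for the default "elmerice", which would indicate that the configuration file and the -exec-id value do not agree.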

I updated the XIOSOutputSolver's documentation accordingly. Are there other places where the documentation should also be updated?

I am looking forward to hearing your thoughts regarding this suggestion.

Cheers,

Lucas Bastien

lucas-ige commented 4 months ago

This is the pull request I am referring to.

tzwinger commented 4 months ago

I merged it. Please test and let me know whether we can close this issue.

lucas-ige commented 4 months ago

Hello, thank you for the merge.

I tested the merged code: it works as intended (I ran one simulation with two ice sheets at once and another simulation with a single ice sheet to test the default backward-compatible behavior).

This issue can be closed, as far as I am concerned. Quick question though: I documented this -exec-id feature in the documentation of the XIOSOutputSolver; should it also be documented somewhere else? If so, I'm happy to do it. If not, feel free to close this issue.