Restructure engines

nkrah commented 9 months ago

The main purpose of this PR is to move subprocess handling to the Simulation class. Parts of the SimulationEngine can then be simplified. Ideally, the PR will also implement other operations in the Simulation class which require some degree of G4 initialization, e.g. geometry-only visualization.

nkrah commented 9 months ago

DO NOT MERGE YET! still adding some features

nkrah commented 9 months ago

This PR restructures the engines to disentangle simulation engine runs (and related operations), context handling, and subprocess handling. In particular:

The SimulationEngine is now agnostic about subprocesses. When run, it initializes itself, launches the simulation, and returns the output. This is implemented in SimulationEngine.run_engine().
The simulation engine needs to be created in a context, i.e. in a with Simulation as se clause, to ensure proper handling of the G4 references created by the engine. This is now implemented in the Simulation class, in Simulation._run_simulation_engine(). (NB: in any case, the SimulationEngine has a finalizer attached which should ensure the close() sequence and prevent segfaults, but the context is cleaner).
The method Simulation.run() is the user interface which takes care of properly launching the _run_simulation_engine() method.
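The context handling described above can be illustrated with a minimal, self-contained sketch. ToyEngine and run_simulation_engine are hypothetical stand-ins, not the opengate classes; the sketch only assumes that the engine exposes close() and run_engine(), as mentioned above, and uses weakref.finalize as one possible way to attach a finalizer.

```python
import weakref


class ToyEngine:
    """Hypothetical stand-in for SimulationEngine (not the opengate class)."""

    def __init__(self, log):
        self.log = log
        # Fallback: the cleanup runs at garbage collection if the engine
        # was never used in a `with` block.
        self._finalizer = weakref.finalize(self, ToyEngine._cleanup, log)

    @staticmethod
    def _cleanup(log):
        # Placeholder for releasing the references held by the engine.
        log.append("closed")

    def close(self):
        # A weakref.finalize object runs its callback at most once,
        # so calling close() twice is harmless.
        self._finalizer()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions

    def run_engine(self):
        self.log.append("run")
        return {"status": "done"}


def run_simulation_engine(log):
    # The engine only lives inside the context, so close() is guaranteed
    # to be called before the output is handed back to the caller.
    with ToyEngine(log) as se:
        output = se.run_engine()
    return output
```

With this layout, the explicit context is the primary cleanup path and the finalizer only acts as a safety net.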

In other words:

The user interacts with Simulation.run() and specifies how the run should be done. Currently, the choice is between start_new_process=True or False.
If start_new_process is False, _run_simulation_engine() is run directly and the output is assigned to the simulation object for later reference.
If start_new_process is True, _run_simulation_engine() is passed through the dispatch_to_subprocess() function which handles the subprocess and queue, returns the collected output from _run_simulation_engine(), and assigns it to the simulation object.
The SimulationEngine class is implemented in such a way that all sub-engines are already created in __init__. In this way, within the engine, one can always assume that all engines exist during the lifetime of the SimulationEngine.
SimulationEngine.__init__ takes new_process as input, but this is only a flag for information. There is no subprocess handling within the engine!
There is a new module opengate/processing.py whose purpose is to collect all code related to handling processes (in a computational sense, not G4 physics processes).
Within this module, there is now a simple function dispatch_to_subprocess which takes as input any function to be run in a subprocess and the arguments and keyword arguments which one would usually pass to the function directly. The queue is added to the arguments automatically via the thin wrapper function target_func.
For example, in the Simulation class, instead of self._get_voxelized_geometry(extent, spacing, margin), the function is run in a subprocess via dispatch_to_subprocess(self._get_voxelized_geometry, extent, spacing, margin).
The advantage of this dispatch mechanism is that the function to be run in the subprocess does not need to implement any queue handling as this is done via the wrapper.
See for example Simulation.run() where self._run_simulation_engine() is run directly or dispatched.
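The dispatch mechanism described above could look roughly like the following sketch. It is not the code in opengate/processing.py: the "fork" start method (POSIX only), the order of queue.get() and join(), and the ToySimulation class are assumptions made to keep the example short and runnable as a plain script.

```python
import multiprocessing
import os


def target_func(queue, func, *args, **kwargs):
    """Thin wrapper: call func and put its return value on the queue."""
    queue.put(func(*args, **kwargs))


def dispatch_to_subprocess(func, *args, **kwargs):
    """Run func(*args, **kwargs) in a child process and return its result."""
    # Assumption: "fork" is used here for brevity; it is POSIX only.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    process = ctx.Process(
        target=target_func, args=(queue, func, *args), kwargs=kwargs
    )
    process.start()
    # Read the result before joining so that a large result cannot block
    # the child. Error handling for a crashing child is omitted.
    result = queue.get()
    process.join()
    return result


class ToySimulation:
    """Hypothetical stand-in for Simulation; only the dispatch is shown."""

    def __init__(self):
        self.output = None

    def _run_simulation_engine(self, start_new_process):
        # Placeholder for creating the engine in a context and running it.
        return {"new_process": start_new_process, "pid": os.getpid()}

    def run(self, start_new_process=False):
        if start_new_process:
            self.output = dispatch_to_subprocess(
                self._run_simulation_engine, True
            )
        else:
            self.output = self._run_simulation_engine(False)
```

Note that _run_simulation_engine() contains no queue handling at all; the wrapper adds it, which is the point of the design.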

nkrah commented 9 months ago

The PR also implements Simulation.voxelize_geometry(), which replaces a utility function previously implemented in opengate.image. Simulation.voxelize_geometry() correctly handles the simulation engine necessary for the voxelization and runs the voxelization in a subprocess. The subprocess is necessary because otherwise the user could not run multiple voxelizations in one script: the previously created G4 UISession instances would collide with those of the subsequent engine. This does not happen if the voxelization is encapsulated within a subprocess.

The new implementation of Simulation.voxelize_geometry() is flexible in terms of the extent within the world to be voxelized. The extent can be a tuple of two points in space (3-vectors) specifying the opposite corners of the box to be voxelized, or it can be a volume or a list of volumes whose bounding box is used to determine the box to be voxelized. The PR corrects a bug here, namely that the volume's translation was previously not taken into account. Finally, the user can specify extent='auto' (the default), which prompts Gate to choose the voxelized box so that it encloses all volumes which are attached to the world (or parallel worlds).

The default spacing is now (3,3,3) to avoid large files and long voxelization times. The user can choose a smaller spacing if needed.
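To illustrate how the extent and the spacing interact, here is a small sketch that derives an enclosing box from a list of volumes and the resulting number of voxels. The function names, the (size, translation) representation of a volume, and the meaning of margin are assumptions for illustration only; rotations are ignored, and this is not the opengate implementation.

```python
import math


def enclosing_box(volumes, margin=0.0):
    """Return (lower, upper) corners of the box enclosing all volumes.

    Each volume is a hypothetical (size, translation) pair of 3-tuples
    describing an axis-aligned box.
    """
    lower = [math.inf] * 3
    upper = [-math.inf] * 3
    for size, translation in volumes:
        for i in range(3):
            # Shift the box by its translation before taking the limits.
            lower[i] = min(lower[i], translation[i] - size[i] / 2)
            upper[i] = max(upper[i], translation[i] + size[i] / 2)
    return (
        tuple(lo - margin for lo in lower),
        tuple(up + margin for up in upper),
    )


def voxel_counts(extent, spacing=(3, 3, 3)):
    """Number of voxels per axis needed to cover the extent."""
    lower, upper = extent
    return tuple(
        math.ceil((up - lo) / sp)
        for lo, up, sp in zip(lower, upper, spacing)
    )
```

Since the voxel count per axis scales inversely with the spacing, halving the spacing multiplies the total number of voxels by about eight, which is why a coarser default keeps files and run times small.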

Simulation.voxelize_geometry() returns the voxelized image and associated labels, and also writes them to disk if the keyword argument filename is provided. Unless filename is an absolute path, it is interpreted relative to the simulation's global output directory, specified via 'sim.output_dir'.
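The path rule for filename can be expressed in a few lines. resolve_output_path is a hypothetical helper written for this illustration, not a function from opengate; the example paths assume a POSIX system.

```python
from pathlib import Path


def resolve_output_path(filename, output_dir):
    """Return filename unchanged if absolute, else relative to output_dir."""
    path = Path(filename)
    if path.is_absolute():
        return path
    return Path(output_dir) / path
```

For example, resolve_output_path("vox.mhd", "/tmp/out") points to /tmp/out/vox.mhd, while an absolute filename such as /data/vox.mhd is left untouched.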

OpenGATE / opengate

Restructure engines #309