Open ndarmage opened 4 years ago
The code has been upgraded with a parallel process pool for the calculation of the adjoint functions (and the contribution function, CF) in each time step; see commit 04f39c2eb5738745cfb9521542da05411aa2eefc. Some parallel blocks are disabled on Windows (and on any other non-POSIX platform), where the code reverts to serial calculation. Process pools are efficient when the worker processes are forked, since they share the parent's data through copy-on-write, but not when they are spawned, as on Windows: spawning makes a full copy of the data for each process. The implementation should be revised in the future so that the code runs efficiently on any platform. This is, however, a low-priority task; for now, users concerned about runtime and memory are advised to run the code on Unix-like systems.
Parallelism through multiprocessing performs poorly on Windows with the current implementation (v1.5.0). Processes are spawned on Windows, whereas they are forked by default on POSIX systems. Spawning forces the input arguments to be copied, which degrades performance.
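To check which case applies on a given machine, one can query the start method that `multiprocessing` will use. This is only a minimal diagnostic sketch, again assuming Python's standard `multiprocessing` module; `workers_inherit_parent_memory` is a hypothetical helper name.

```python
import multiprocessing as mp


def workers_inherit_parent_memory():
    """Return True if new worker processes are created by forking the parent."""
    # "fork" clones the parent process, so its data is shared copy-on-write.
    # "spawn" and "forkserver" start workers that do not inherit that data,
    # so everything they need must be pickled and sent to them.
    return mp.get_start_method() == "fork"


if __name__ == "__main__":
    print("start method:", mp.get_start_method())
    print("workers inherit parent memory:", workers_inherit_parent_memory())
```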