Open davidfrantz opened 4 years ago
There still is a general threading issue in force-higher-level.
It mostly surfaces when using the Level 2 ImproPhe submodule.
I guess it is related to the nested parallelism with OpenMP, wherein 3 teams are used to stream the data. The first team reads data from processing unit pu+1, the second team computes data in pu, and the third team outputs data from pu-1. The teams work simultaneously, and each team can have multiple sub-threads that do the work in parallel.
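To make the streaming layout described above concrete, here is a minimal sketch in C. It is not FORCE's actual code: `read_unit`, `compute_unit`, `write_unit`, `stream_units` and `UNIT_SIZE` are hypothetical names, the "data" is synthetic, and the write stage only returns a checksum instead of writing to disk. It assumes the three teams can be modelled as three OpenMP sections whose inner loops each start their own nested team.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define UNIT_SIZE 1000 /* hypothetical number of values per processing unit */

/* Hypothetical read stage: fill one processing unit with synthetic values. */
static void read_unit(long *buf, int pu, int nthread) {
    #pragma omp parallel for num_threads(nthread)
    for (int i = 0; i < UNIT_SIZE; i++) buf[i] = (long)pu + i;
}

/* Hypothetical compute stage: double every value in place. */
static void compute_unit(long *buf, int nthread) {
    #pragma omp parallel for num_threads(nthread)
    for (int i = 0; i < UNIT_SIZE; i++) buf[i] *= 2;
}

/* Hypothetical write stage: return a checksum instead of writing a file. */
static long write_unit(const long *buf, int nthread) {
    long sum = 0;
    #pragma omp parallel for num_threads(nthread) reduction(+:sum)
    for (int i = 0; i < UNIT_SIZE; i++) sum += buf[i];
    return sum;
}

/* Stream n_units processing units through read -> compute -> write.
 * In each step, unit pu+1 is read, unit pu is computed and unit pu-1
 * is written, with the three stages running side by side. */
long stream_units(int n_units, int n_read, int n_compute, int n_write) {
    assert(n_units > 0 && n_read > 0 && n_compute > 0 && n_write > 0);

    long (*units)[UNIT_SIZE] = malloc((size_t)n_units * sizeof *units);
    if (units == NULL) return -1;

#ifdef _OPENMP
    omp_set_max_active_levels(2); /* allow the inner (nested) teams */
#endif

    long total = 0;
    for (int pu = -1; pu <= n_units; pu++) {
        long written = 0;
        /* Outer level: one thread per stage. */
        #pragma omp parallel sections num_threads(3)
        {
            #pragma omp section
            { if (pu + 1 < n_units) read_unit(units[pu + 1], pu + 1, n_read); }
            #pragma omp section
            { if (pu >= 0 && pu < n_units) compute_unit(units[pu], n_compute); }
            #pragma omp section
            { if (pu - 1 >= 0) written = write_unit(units[pu - 1], n_write); }
        }
        total += written;
    }

    free(units);
    return total;
}
```

Calling `stream_units(4, 8, 22, 4)` would mirror thread counts like those in a parameter file. Built with `gcc -fopenmp`, every loop iteration opens one outer and up to three inner parallel regions; whether the inner teams reuse existing threads or spawn new ones is left to the OpenMP runtime, which is the behaviour the suspicion in this issue is about. Built without `-fopenmp`, the pragmas are ignored and the same code runs strictly sequentially, which gives a baseline for comparison.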
When the teams work sequentially instead of simultaneously, this issue does not appear.
I suspect that threads are not re-used and new ones are created instead, and that at some point the maximum number of threads allowed on the system is reached. But this is only a suspicion.
Related to this: the memory footprint of the process keeps growing, which it doesn't when processing sequentially. I wasn't able to track down the problem. Memchecking with valgrind didn't show any memory leak.
So how do I process the *.prm file sequentially? Do I need to change, e.g.,
NTHREAD_READ = 8
NTHREAD_COMPUTE = 22
NTHREAD_WRITE = 4
to
NTHREAD_READ = 1
NTHREAD_COMPUTE = 1
NTHREAD_WRITE = 1
or should I just avoid running force-higher-level with parallel, e.g. `ls *.prm | parallel -j8 force-higher-level {}`?
Please note that the error mentioned above occurred running force-higher-level with a single prm file.
Reported by @jakimowb via email.
The Level 2 ImproPhe submodule in force-higher-level occasionally throws this error: