vlas-sokolov / bayesian-ngc1333

Nested sampling of the GAS DR1 NGC1333 ammonia data
MIT License

multiple spawned processes using large amounts of memory #8

Open SpandanCh opened 3 years ago

SpandanCh commented 3 years ago

@vlas-sokolov, I am running the code on a synthetic cube, and I noticed that after around 6-7 hours, extra processes start popping up, each using a lot of memory (~37 GB), even though my data cubes are only 200 MB and 10 MB. In htop, the processes appear as `orted --hnp --set-sid --report-uri 9 --singleton-died-pipe 10 -mca state_novm_select 1`

This appears to be an MPI-related process (https://github.com/open-mpi/ompi/issues/4577). If I kill one of these processes, all the others, as well as the Python processes, terminate.

Have you come across something similar? This appears to be a new problem that started after a recent system upgrade. Could it be caused by the version of one of the packages?

vlas-sokolov commented 3 years ago

Never happened to me, but it looks bad! Is the issue the same as the linked one, with infinite processes spawning? In that case (and maybe other cases too), have you tried upgrading your OpenMPI version?

jpinedaf commented 3 years ago

@SpandanCh could you check what is the OpenMPI version installed?

SpandanCh commented 3 years ago

> @SpandanCh could you check what is the OpenMPI version installed?

It's 2.1.1
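
For reference, here is a minimal stdlib-only Python sketch that parses the version reported by `mpirun --version` and makes it easy to compare against a target release. The `mpirun` invocation assumes OpenMPI is on `PATH`; the helper name is just illustrative, not part of the repository's code.

```python
import re
import shutil
import subprocess


def ompi_version(output=None):
    """Return the OpenMPI version as a comparable (major, minor, patch) tuple.

    If `output` is None, run `mpirun --version` (requires OpenMPI on PATH);
    otherwise parse the given string. Returns None if nothing is found.
    """
    if output is None:
        if shutil.which("mpirun") is None:
            return None
        output = subprocess.run(
            ["mpirun", "--version"], capture_output=True, text=True
        ).stdout
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    return tuple(map(int, match.groups())) if match else None


# The version reported in this thread predates the 3.x series:
assert ompi_version("mpirun (Open MPI) 2.1.1") == (2, 1, 1)
assert ompi_version("mpirun (Open MPI) 2.1.1") < (3, 0, 0)
```

Comparing version tuples rather than raw strings avoids the classic pitfall where `"10.0.0" < "2.1.1"` lexicographically.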

vlas-sokolov commented 3 years ago

Looks like releases nowadays go all the way to 5.x, and 3.x was already around back when I was working on this. Maybe give updating a try?

vlas-sokolov commented 3 years ago

That being said, it's my only lead, and I have no clue what else could be wrong here.

SpandanCh commented 3 years ago

Ok, thanks. I will try updating.