Closed EleBern closed 1 year ago
Hi EleBern,
There seem to be two conflicts when using conda with NetPyNE.
1) First, you can run tutorial 8 without calling MPI. For example,
nrniv -python tut8_batch.py
prints all the information about submitting the jobs, but when NetPyNE picks up the first one, it freezes. This seems to arise from how stdout and stderr are managed in the Conda environment. So there are two options:
1a) Comment out the lines that write stdout and stderr when finalizing the jobs (lines 82 and 83 of the "runJob" method in "grid.py" within the "batch" package). But maybe the better option is...
1b) Run the program through "conda run". In your conda prompt terminal, write
conda run nrniv -python tut8_batch.py
In this case, "conda run" buffers all I/O by default, so all jobs get done, but the output is only displayed at the end.
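To illustrate the stdout/stderr point in general terms, here is a minimal Python sketch. It is not NetPyNE's actual "runJob" code, and the helper name run_job_to_files and the file names are made up for illustration. The idea is that a launcher which sends each job's output straight to files, rather than through pipes, never has to read from the child process, so that particular source of blocking cannot occur. Whether this matches what happens inside the Conda environment here is an assumption.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_job_to_files(command, out_path, err_path):
    """Run `command` with stdout/stderr written directly to files.

    Hypothetical helper, not part of NetPyNE. Because no pipes are used,
    the parent never needs to drain the child's output.
    """
    with open(out_path, "w") as out, open(err_path, "w") as err:
        proc = subprocess.Popen(command, stdout=out, stderr=err)
        return proc.wait()  # exit code of the job


if __name__ == "__main__":
    # Stand-in job; in practice this could be something like
    # ["nrniv", "-python", "tut8_batch.py"].
    job = [sys.executable, "-c", "print('job finished')"]
    with tempfile.TemporaryDirectory() as tmp:
        out_file = Path(tmp) / "job.run"
        err_file = Path(tmp) / "job.err"
        code = run_job_to_files(job, out_file, err_file)
        print("exit code:", code)
        print("captured stdout:", out_file.read_text().strip())
```

The job's messages can then be read from the files afterwards, which is similar in effect to the "output at the end" behaviour described for "conda run".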
2) Second, as soon as the above issue is solved, a new one arises with MPI. In this case, you can again run everything through "conda run". For example,
mpiexec -n 4 conda run nrniv -python -mpi tut8_batch.py
or
conda run mpiexec -n 4 conda run nrniv -python -mpi tut8_batch.py
and you get all the results (otherwise, without "conda run", you get n-1, n being the number of processes). But you don't get any feedback, and you won't get control of the terminal back. It seems that the subprocesses opened on each processor emit something that Conda messes up, and NetPyNE keeps waiting for something that never comes.
Short story with conda: use "conda run". Without MPI, everything works fine, but the output is only shown at the end. With MPI, everything gets written to files, but the output is never displayed (and you have to stop the program yourself, because it gets stuck in a never-ending while loop). All the best,
Eugenio
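Since the MPI run described above writes its results to files but never returns on its own, one way to decide when it is safe to stop it is to count the result files from a second terminal. The following is only a sketch under assumptions: the helper name wait_for_results, the output folder, and the "*_data.json" file pattern are hypothetical and need to be adapted to whatever the batch actually writes.

```python
import tempfile
import time
from pathlib import Path


def wait_for_results(folder, expected, pattern="*_data.json",
                     timeout=60.0, poll=1.0):
    """Poll `folder` until `expected` files match `pattern` or `timeout` passes.

    Hypothetical helper, not part of NetPyNE. Returns the sorted list of
    matching files found, which may be shorter than `expected` on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = sorted(Path(folder).glob(pattern))
        if len(found) >= expected or time.monotonic() >= deadline:
            return found
        time.sleep(poll)


if __name__ == "__main__":
    # Demo with a temporary folder standing in for the batch output folder.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            (Path(tmp) / f"sim_{i}_data.json").write_text("{}")
        done = wait_for_results(tmp, expected=9, timeout=2.0, poll=0.5)
        print(f"{len(done)} of 9 result files present")
```

Unlike an open-ended while loop, this returns after the timeout even if some results never appear, so a shortfall such as n-1 results shows up as a count rather than as a hang.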
Hi Eugenio,
Thank you very much for your detailed answer.
Best wishes, Eleonora
Hello, I am working on resolving Windows and NetPyNE issues. It seems a satisfactory solution was found, so I will close this issue for now.
When running tutorial 8 on Windows, I get the results for 2 simulations when using 3 cores and for 3 simulations when using 4 cores. The total number of batches is 9. I don't get any errors or warnings on the console, but the run stays pending indefinitely.
I'm using: Python 3.9.7, Conda 4.9.2, NetPyNE 1.0.0.2