Closed epatyukova closed 3 weeks ago
Could you please be a bit more specific about how you are monitoring internet traffic and how you conclude it is due to AiiDA? The example you show should not make any network connections as far as I know. What is the error or problem if you try to run the example without an internet connection? How is the Computer that you use in the calculation defined? Is it really the localhost machine?
(1) I could see the amount of traffic on my mobile network operator account (I tried to run calculations on holiday, where no other internet connection was available). I'm not sure about the precise amount. I'm not a specialist in networks, so I can't give a better description, sorry. (2) If I disconnected from the internet, calculations failed: aiida.out was empty because pw.x was not started. At the same time all AiiDA files were created, so the AiiDA machinery was working, but the calculations were not done. (3) I defined the computer and the code as described on the AiiDA website, with localhost, core.local transport and core.direct scheduler, as described here: https://aiida.readthedocs.io/projects/aiida-core/en/latest/howto/run_codes.html#how-to-run-codes.
Were the _scheduler-stderr.txt and _scheduler-stdout.txt files present in the working directory? What was their content? And what is the output of verdi process report <PK> for the failed calculation?
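For anyone following along, here is a minimal sketch of one way to collect the content of the two scheduler files being asked about. It uses only the Python standard library; the function name `read_scheduler_files` and the working directory path you pass in are hypothetical and not part of AiiDA.

```python
from pathlib import Path

# File names as they appear in the calculation's working directory.
SCHEDULER_FILES = ("_scheduler-stdout.txt", "_scheduler-stderr.txt")


def read_scheduler_files(workdir):
    """Return a dict mapping each scheduler file name to its content.

    A missing file maps to None, so "file absent" and "file empty"
    can be told apart.
    """
    result = {}
    for name in SCHEDULER_FILES:
        path = Path(workdir) / name
        if path.is_file():
            result[name] = path.read_text(errors="replace")
        else:
            result[name] = None
    return result
```

Calling `read_scheduler_files("/path/to/workdir")` on the directory of the failed calculation and printing the result shows whether each file exists and what it contains.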
_scheduler-stderr.txt:
_scheduler-stdout.txt is empty
*** 4 LOG MESSAGES:
+-> WARNING at 2024-10-29 18:01:51.331929+00:00
 | key 'symmetries' is not present in raw output dictionary
+-> ERROR at 2024-10-29 18:01:51.347520+00:00
 | ERROR_OUTPUT_STDOUT_INCOMPLETE
+-> ERROR at 2024-10-29 18:01:51.349059+00:00
 | Both the stdout and XML output files could not be read or parsed.
+-> WARNING at 2024-10-29 18:01:51.349968+00:00
 | output parser returned exit code<305>: Both the stdout and XML output files could not be read or parsed.
Thanks for the additional details.
The reason the calculation is not running is shown in _scheduler-stderr.txt. Apparently, on your machine the script is trying to launch a PMIx server listener. I am not familiar with this tool, but how did you configure the local computer? What kind of MPI are you expecting to be used? Can you share the output of verdi computer show localhost and verdi computer configure show localhost?
Whatever you have configured, it seems it may be this that is actually trying to connect to the outside world. This is not something built into AiiDA though.
Thank you for the comment. I did not do anything with the PMIx server listener (maybe some system configuration created by IT is one of the reasons, I do not know); I just followed the AiiDA manual. It is strange, though, that I can't do any calculations locally offline.
verdi computer show qe-computer:
Label                        qe-computer
PK                           1
UUID                         afaf9e95-2647-417f-8bd4-cfcb72bb143f
Description
Hostname                     localhost
Transport type               core.local
Scheduler type               core.direct
Work directory               /Users/elena.patyukova/Documents/github/aiida-work
Shebang                      #!/usr/bin/env python3
Mpirun command               mpirun -np 4
Default #procs/machine       4
Default memory (kB)/machine
Prepend text
Append text
verdi computer configure show qe-computer:
On the internet people write that it is RabbitMQ that uses the PMIx server listener, so it is probably an issue with RabbitMQ.
RabbitMQ is not being managed from inside a job, so it wouldn't show up in these output files, I am pretty sure. Could you share the content of the _aiidasubmit.sh script? And what kind of machine are you running AiiDA on? Is it your personal laptop, a workstation, or some remote compute cluster? How did you compile/install QE itself?
Thank you. So, the _aiidasubmit.sh is
exec > _scheduler-stdout.txt
exec 2> _scheduler-stderr.txt
"mpirun" "-np" "4" '/Users/elena.patyukova/Documents/github/q-e/bin/pw.x' '-in' 'aiida.in' > "aiida.out"
I run it on my laptop. I installed QE from source, following the installation instructions in their repository. I just want to add that everything works if the internet connection is on. I do not change anything apart from turning on the internet connection. So, it is weird.
It is indeed weird; I don't really understand where it could be coming from. Did you compile QE with MPI support? Maybe just try running it without MPI. The aiida-quantumespresso calculation launch pw command should have an option like --without-mpi (check the help for the exact form of the option) to disable MPI.
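When launching through the Python API instead of the CLI, the corresponding switch is, as far as I know, the `withmpi` entry of the CalcJob `metadata.options`. Below is a minimal sketch that only builds the options dictionary, so it runs without AiiDA installed. The helper name `build_options` is hypothetical, and the resource values are placeholder assumptions.

```python
def build_options(with_mpi, num_mpiprocs=1):
    """Build a dictionary intended for a CalcJob's metadata.options.

    with_mpi=False asks AiiDA not to prepend the computer's mpirun
    command to the executable in the submit script.
    """
    return {
        "withmpi": with_mpi,
        "resources": {
            "num_machines": 1,
            "num_mpiprocs_per_machine": num_mpiprocs,
        },
    }


# Serial run: no mpirun in front of pw.x.
serial_options = build_options(with_mpi=False)
```

With a process builder at hand, the result would be assigned along the lines of `builder.metadata.options = serial_options` before submitting.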
Yes, you are right, without MPI it works. So the reason is in MPI.
However, the solutions suggested here https://stackoverflow.com/questions/78348267/the-pmix-servers-listener-thread-failed-to-start-we-cannot-continue and here https://stackoverflow.com/questions/16077460/does-rmpi-require-an-active-internet-connection/16121528#16121528 are not working. (I tried changing the mpirun command in the computer setup from mpirun -n 4 to mpirun --mca btl_tcp_if_include lo0 -n 4, and also setting the environment variable in my Python script; neither worked.)
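One thing that may be worth ruling out: a variable set only in the launching Python script may not end up in the job script that actually runs mpirun. Open MPI reads MCA parameters from environment variables named `OMPI_MCA_<parameter>`, and AiiDA CalcJobs accept an `environment_variables` entry in `metadata.options`. The sketch below only builds that dictionary; the helper name `mca_env` is hypothetical, and there is no claim that this setting resolves the PMIx error.

```python
def mca_env(params):
    """Map Open MPI MCA parameter names to OMPI_MCA_* environment variables."""
    return {f"OMPI_MCA_{name}": str(value) for name, value in params.items()}


# Same setting as `mpirun --mca btl_tcp_if_include lo0`, expressed as an
# environment variable to be exported for the job (assumption: passed to
# AiiDA via metadata.options.environment_variables).
options_patch = {
    "environment_variables": mca_env({"btl_tcp_if_include": "lo0"}),
}
```

Merging `options_patch` into the calculation's options should make the variable visible to mpirun itself, which can be confirmed by checking the generated _aiidasubmit.sh.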
Glad we pinned it down to whatever version of MPI is installed on your system. There is not much more we can do for you I am afraid, as this is not an AiiDA problem. You would have the same problem if you run QE with MPI directly without AiiDA. So I will close this issue for now.
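To check this independently of AiiDA, the same command line that appears in _aiidasubmit.sh can be run by hand with the network off. Here is a minimal sketch using only the standard library; the function names are hypothetical, and the paths in the usage note are placeholders to be replaced with your own.

```python
import shlex
import subprocess


def build_command(mpirun_command, executable, input_file):
    """Assemble the argument list: <mpirun command> <executable> -in <input>."""
    return shlex.split(mpirun_command) + [executable, "-in", input_file]


def run_directly(command, workdir):
    """Run the command in workdir and return (returncode, stderr).

    Returns (None, message) if the launcher or the directory is not found.
    """
    try:
        completed = subprocess.run(
            command, cwd=workdir, capture_output=True, text=True
        )
    except FileNotFoundError as exc:
        return None, str(exc)
    return completed.returncode, completed.stderr
```

For example, `run_directly(build_command("mpirun -np 4", "/path/to/pw.x", "aiida.in"), "/path/to/workdir")` should show the same PMIx message in the returned stderr if the problem lies in the MPI installation rather than in AiiDA.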
Hello! I have observed that I can run this tutorial https://aiida-quantumespresso.readthedocs.io/en/latest/tutorials/first_pw.html#tutorials-pw-through-cli for running pw.x through API only if I have an internet connection, though I'm running the calculations locally. If I do not have an internet connection, pw.x is not started (the output file aiida.out is generated, but is empty). If I have an internet connection, everything works, but the amount of internet traffic consumed is considerable, even though I run the calculations locally.
Can you please explain what is going on? Thank you!