BoothGroup / Vayesta

A Python package for wave function-based quantum embedding
Apache License 2.0

Limit to ERI size for mpi cast #150

Open efertitta opened 10 months ago

efertitta commented 10 months ago

It seems that above a certain size, the ERIs cannot be broadcast from the master rank to the other ranks. I am forced to disable mf = mpi.scf(mf) and run the SCF on each task instead. The error below implies that there is a maximum size of 2 GB for arrays to be broadcast. It appears (https://github.com/mpi4py/mpi4py/issues/119) that this is a known issue that can be overcome in newer versions of MPI.

File "mpi4py/MPI/msgbuffer.pxi", line 250, in mpi4py.MPI.message_simple
File "mpi4py/MPI/msgbuffer.pxi", line 511, in mpi4py.MPI._p_msg_cco.for_cco_recv
File "mpi4py/MPI/msgbuffer.pxi", line 50, in mpi4py.MPI.downcast
File "mpi4py/MPI/msgbuffer.pxi", line 495, in mpi4py.MPI._p_msg_cco.for_cco_send
OverflowError: integer 2372038530 does not fit in 'int'
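For context, the overflow is arithmetic: MPI (before the MPI-4 large-count API) passes message counts as a C int, and the count in the traceback exceeds the largest value a C int can hold. A quick sanity check:

```python
# The count 2372038530 reported in the traceback exceeds the maximum
# value of a signed 32-bit C int, which is how MPI represents message
# counts on builds without large-count support; hence the OverflowError.
C_INT_MAX = 2**31 - 1  # 2147483647
count = 2372038530     # the offending count from the traceback above
print(count > C_INT_MAX)  # → True: the count cannot be represented
```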

obackhouse commented 10 months ago

By far the most annoying thing when working with mpi4py... https://github.com/pyscf/pyscf/blob/master/pyscf/agf2/mpi_helper.py has blocking loops that make sure this limit isn't triggered. We depend on PySCF, so we can use these directly, or just copy them into the Vayesta MPI module.
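For reference, the blocking idea can be sketched roughly as follows. This is a minimal sketch, not the actual PySCF helper; the function names and chunk size here are assumptions, and the arrays are assumed C-contiguous:

```python
import numpy as np

# Hypothetical helper illustrating the blocking-loop idea: split a
# flat array into chunks that each stay below the C-int count limit,
# then broadcast chunk by chunk instead of in one oversized message.
BLKSIZE = 2**30  # elements per chunk; anything below 2**31 - 1 works

def iter_chunks(arr, blksize=BLKSIZE):
    """Yield contiguous 1D views of arr, each at most blksize elements."""
    flat = arr.reshape(-1)  # a view for C-contiguous input
    for start in range(0, flat.size, blksize):
        yield flat[start:start + blksize]

def bcast_chunked(arr, comm=None, root=0):
    """Broadcast a numpy array in sub-2GB chunks (sketch)."""
    for chunk in iter_chunks(arr):
        if comm is not None:
            comm.Bcast(chunk, root=root)  # mpi4py buffer-based broadcast
    return arr
```

Because each chunk is a view into the original array, the in-place Bcast on non-root ranks fills the full array without any extra copy.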

basilib commented 10 months ago

This might be related to the issue addressed in this PR: #76. If you are broadcasting numpy arrays, you can use Bcast instead of bcast to overcome this limit.
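As I understand mpi4py's implementation, the two paths differ in what must fit in a C int: lowercase bcast pickles the whole object into a single bytes buffer (so the pickled size is the count), while uppercase Bcast sends the raw array buffer (so the element count is the count). A quick MPI-free illustration of the pickled payload size:

```python
import pickle
import numpy as np

# mpi4py's lowercase bcast pickles the object and ships one bytes
# buffer; uppercase Bcast ships the raw array buffer. In both cases
# a single message above the C-int limit overflows on MPI builds
# without large-count support.
a = np.ones(1000, dtype=np.float64)  # 8000 bytes of raw data
payload = pickle.dumps(a)
print(len(payload) >= a.nbytes)      # pickle adds metadata overhead → True
```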

maxnus commented 10 months ago

mpi.scf actually already uses this wrapper. Is it possible that both Bcast and bcast have the 2 GB limit? (To be honest, it would be weird if this wasn't the case...)