ornladios / ADIOS2

Next generation of ADIOS developed in the Exascale Computing Program
https://adios2.readthedocs.io/en/latest/index.html
Apache License 2.0

Failed to use MPI to write into file in Python #3391

Open. Haoju-Leng opened this issue 1 year ago

Haoju-Leng commented 1 year ago

I am using adios2 installed from conda, with Python 3.7 under Ubuntu 22.04. I was trying to write 4 simple numpy arrays into the same variable using mpi4py. However, when I read the bp file, the data written by the previous ranks had been overwritten by the last rank. It seems that each process defined a variable with the same name separately, which caused the overwrite. I used 4 processes for the script. Below is my code:

from mpi4py import MPI
import numpy
import adios2

# MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# User data
myArray = numpy.array([0, 1., 2., 3., 4., 5., 6., 7., 8., 9.])
Nx = myArray.size

# ADIOS MPI Communicator, debug mode
adios = adios2.ADIOS(comm)

# ADIOS IO
bpIO = adios.DeclareIO("BPFile_N2N")
bpIO.SetEngine('bp4')

bpFileWriter = bpIO.Open("clinical_patches/npArray.bp", adios2.Mode.Write)

# ADIOS Variable: name, shape, start, count, constant dims
ioArray = bpIO.DefineVariable(
    "bpArray", myArray, [size * Nx], [rank * Nx], [Nx], adios2.ConstantDims)

# ADIOS Engine
bpFileWriter.Put(ioArray, myArray, adios2.Mode.Sync)
bpFileWriter.Close()

When I simply inquire the variable and print out the data, the output is supposed to be:
[0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
Instead, the output was:
[0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

Haoju-Leng commented 1 year ago

Do I need to configure, link, or build anything after installing adios2 for Python via conda? It seems adios2 is not using MPI, even though I passed the MPI communicator to ADIOS and no error was thrown.

williamfgc commented 1 year ago

@Haoju-Leng your example defines a new variable per rank, so the dataset output is what's expected. All ranks should be writing to the same common variable for what you want to achieve. I encourage you to use bpls on the resulting datasets. Check the examples and tests for tested adios2 usage as well. Hope it helps.
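
For reference, here is a minimal serial read-back sketch (not part of the original thread; it assumes the same low-level adios2 Python bindings and the clinical_patches/npArray.bp path used above) to inspect what was actually written. The bpls command-line tool gives similar information without any code (bpls clinical_patches/npArray.bp lists the variables, and the -d flag dumps their values).

import numpy
import adios2

# serial reader: open the dataset and inquire the global variable
adios = adios2.ADIOS()
bpIO = adios.DeclareIO("BPFile_Read")
bpReader = bpIO.Open("clinical_patches/npArray.bp", adios2.Mode.Read)

ioArray = bpIO.InquireVariable("bpArray")
if ioArray:
    # allocate a buffer for the full global array and read it back
    data = numpy.zeros(ioArray.Shape()[0], dtype=numpy.float64)
    bpReader.Get(ioArray, data, adios2.Mode.Sync)
    print(data)  # all four 0..9 blocks should appear if every rank's write landed

bpReader.Close()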

Haoju-Leng commented 1 year ago

@williamfgc Thanks for the help. In the example, I defined the variable with the same name on every rank but with a different local start and offset. Are you suggesting using InquireVariable() to get the variable defined by the other ranks? I tried it, but it always failed to find the variable, which means the variable ("bpArray" in this case) had not been defined by the other ranks.

The code I tried to add to the code above:

if bpIO.InquireVariable("bpArray"):
    ioArray = bpIO.InquireVariable("bpArray")
else:
    ioArray = bpIO.DefineVariable(
        "bpArray", myArray, [size * Nx], [rank * Nx], [Nx], adios2.ConstantDims)

I used the same logic in C++ with adios2 and MPI and it worked, but I am not sure why the Python version failed.

Do I need to configure, link, or build anything after installing adios2 for Python via conda? I just tried several of the adios2 tests on GitHub and they all failed, so I am not sure whether it is a configuration issue or an issue with my implementation.

williamfgc commented 1 year ago

The new version of your code snippet should work. Can you run a hello-world example from mpi4py and verify that it's running in parallel? As for conda builds, a variant with MPI should be installed; building from source might also help rule out a conda installation issue. It's also good to check whether MPI is initialized on the mpi4py side.
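
As a concrete starting point, a minimal mpi4py check along the lines below (a sketch, not from this thread) should print one line per process with distinct ranks when launched under mpirun/mpiexec, e.g. mpirun -n 4 python hello_mpi.py (the script name is just an example). If every process reports rank 0 of 1, MPI is not actually being used.

from mpi4py import MPI

comm = MPI.COMM_WORLD
# each launched process should print a distinct rank and the common size
print("rank", comm.Get_rank(), "of", comm.Get_size(),
      "- MPI initialized:", MPI.Is_initialized())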

pnorbert commented 1 year ago

@Haoju-Leng Do you still have issues with this, or can we close this ticket? Thank you