ornladios / ADIOS2

Next generation of ADIOS developed in the Exascale Computing Program
https://adios2.readthedocs.io/en/latest/index.html
Apache License 2.0

Multi-block read/write using InSituMPI or SST/FFS corrupts data #1468

Open keichi opened 5 years ago

keichi commented 5 years ago

To reproduce, please build and run this pipeline. The test is essentially a stripped-down version of testing/adios2/engine/bp/TestBPWriteMultiblockRead.cpp. With BPFile or SST/BP it works fine, but with InSituMPI or SST/FFS the data received on the reader side is corrupted.
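For readers unfamiliar with the terminology, here is a minimal sketch of what a multi-block read is expected to produce. It does not use ADIOS2 at all: `Block`, `makeBlocks`, and `readMultiBlock` are hypothetical names introduced only for illustration. The block size of 3 used in the usage note is an assumption inferred from the output patterns below, not something stated in the test.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one block selection on a 1-D variable:
// a start offset and an element count.
struct Block {
    std::size_t start;
    std::size_t count;
};

// Split a 1-D array of `shape` elements into consecutive blocks of at most
// `blockSize` elements. Returns an empty list if blockSize is 0.
std::vector<Block> makeBlocks(std::size_t shape, std::size_t blockSize) {
    std::vector<Block> blocks;
    if (blockSize == 0) {
        return blocks;
    }
    for (std::size_t start = 0; start < shape; start += blockSize) {
        const std::size_t remaining = shape - start;
        const std::size_t count = remaining < blockSize ? remaining : blockSize;
        blocks.push_back({start, count});
    }
    return blocks;
}

// Multi-block read: one request per block, each landing at its own offset
// in the destination buffer. Elements not covered by any block stay 0.
std::vector<double> readMultiBlock(const std::vector<double>& written,
                                   const std::vector<Block>& blocks) {
    std::vector<double> received(written.size(), 0.0);
    for (const Block& b : blocks) {
        for (std::size_t i = 0; i < b.count; ++i) {
            received[b.start + i] = written[b.start + i];
        }
    }
    return received;
}
```

Calling `readMultiBlock(u, makeBlocks(12, 3))` on a 12-element array `u` holding 0 through 11 returns the same 12 values, which is what a correct engine should deliver regardless of how the reads are split.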

BPFile

$ ./writer
$ ./reader
Shape: 12
Count: 12
Received u is: 0 1 2 3 4 5 6 7 8 9 10 11

InSituMPI

Single-block write + Multi-block read:

$ mpirun -n 1 ./writer : -n 1 ./reader
Shape: 12
Count: 12
Received u is: 0 0 0 0 0 0 0 0 0 9 10 11

Multi-block write + Multi-block read:

$ mpirun -n 1 ./writer : -n 1 ./reader
Shape: 12
Count: 12
Received u is: 0 0 0 0 0 0 0 0 0 0 1 2

Single-block write + Single-block read and Multi-block write + Single-block read: Works.

SST/FFS

Multi-block write + Multi-block read:

$ mpirun -n 1 ./writer : -n 1 ./reader
Shape: 12
Count: 12
Received u is: 0 0 0 0 0 0 0 0 0 9 10 11

Multi-block write + Single-block read:

$ mpirun -n 1 ./writer : -n 1 ./reader
Shape: 12
Count: 12
Received u is: 0 0 0 0 0 0 0 0 0 9 10 11

Single-block write + Single-block read and Single-block write + Multi-block read: Works.

SST/BP

Works with all combinations.

pnorbert commented 5 years ago

Fixing this requires substantial rewriting of the InSituMPI engine. It still uses old data structures and functions to deliver read schedules between writers and readers, and it has to move to the newer ones to handle multiple blocks per variable per reader.
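To illustrate why a schedule that holds only one request per variable cannot serve a multi-block read, here is a purely hypothetical sketch. It is not the InSituMPI code, and `Request`, `serveSingleSlot`, and `serveMultiSlot` are invented names. It assumes four blocks of 3 elements and shows one way the "0 0 0 0 0 0 0 0 0 9 10 11" pattern can arise; it does not explain the "0 ... 0 1 2" case.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical read request: which slice of the variable the reader wants.
struct Request {
    std::size_t start;
    std::size_t count;
};

// Copy the requested slice from the writer's data into the reader's buffer.
static void deliver(const std::vector<double>& written, const Request& r,
                    std::vector<double>& received) {
    for (std::size_t i = 0; i < r.count; ++i) {
        received[r.start + i] = written[r.start + i];
    }
}

// Assumed "old-style" schedule: one request slot per variable name, so each
// new request for the same variable overwrites the previous one.
std::vector<double> serveSingleSlot(const std::vector<double>& written,
                                    const std::vector<Request>& requests) {
    std::map<std::string, Request> schedule;
    for (const Request& r : requests) {
        schedule["u"] = r;  // only the last request survives
    }
    std::vector<double> received(written.size(), 0.0);
    for (const auto& entry : schedule) {
        deliver(written, entry.second, received);
    }
    return received;
}

// Assumed "newer-style" schedule: a list of requests per variable name, so
// every block requested by the reader is delivered.
std::vector<double> serveMultiSlot(const std::vector<double>& written,
                                   const std::vector<Request>& requests) {
    std::map<std::string, std::vector<Request>> schedule;
    for (const Request& r : requests) {
        schedule["u"].push_back(r);
    }
    std::vector<double> received(written.size(), 0.0);
    for (const auto& entry : schedule) {
        for (const Request& r : entry.second) {
            deliver(written, r, received);
        }
    }
    return received;
}
```

With requests `{0,3}, {3,3}, {6,3}, {9,3}` on an array holding 0 through 11, `serveSingleSlot` fills only the last three elements, while `serveMultiSlot` returns the full array.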

keichi commented 5 years ago

I did a more exhaustive test and edited the description.

keichi commented 5 years ago

@pnorbert I see... as we discussed, I do have a workaround, so an immediate fix is not necessary.