ornladios / ADIOS2

Next generation of ADIOS developed in the Exascale Computing Program
https://adios2.readthedocs.io/en/latest/index.html
Apache License 2.0

Ambiguous ADIOS constructors on HPE MPT MPI where MPI_COMM_WORLD is an integer #2893

Open liangwang0734 opened 3 years ago

liangwang0734 commented 3 years ago

Update 1

Update 2

Unlike some other MPI implementations, the HPE/SGI MPT MPI used here defines MPI_Comm as a plain unsigned int and the predefined communicators as enumerators of an unnamed enum. From hpe/mpt/2.17r13/include/mpi.h:

enum {
        MPI_COMM_NULL           = 0,
        MPI_COMM_WORLD          = 1,
        MPI_COMM_SELF           = 2,
        _MPI_SGI_COMM_LAST
};

typedef unsigned int            MPI_Comm;

I feel this is a corner case that should be fixed. So, I'm going to re-open the issue. However, the workaround in Update 1 still holds.
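
The ambiguity can be reproduced without MPT or ADIOS2. The following is a minimal sketch, not the actual headers: `SketchComm`, `SKETCH_COMM_WORLD`, `SketchAdios`, and `enum_only_call_is_ambiguous` are hypothetical stand-ins, and the `= true` defaults are an assumption inferred from the fact that a one-argument call matches both candidates in the error message. Because the enumerator has the unnamed enum type rather than `unsigned int`, converting it to `unsigned int` and converting it to `bool` are both standard conversions of the same rank, so neither constructor is preferred.

```cpp
#include <type_traits>

// Hypothetical stand-ins for illustration only; these are NOT the real
// MPT or ADIOS2 declarations.
enum
{
    SKETCH_COMM_NULL = 0,
    SKETCH_COMM_WORLD = 1,
    SKETCH_COMM_SELF = 2
};
typedef unsigned int SketchComm;

struct SketchAdios
{
    // The two candidates named in the compiler error.
    // The "= true" defaults are an assumption.
    SketchAdios(SketchComm, bool = true) {}
    SketchAdios(bool = true) {}
};

// Type of the enumerator itself: the unnamed enum, not SketchComm.
typedef decltype(SKETCH_COMM_WORLD) SketchEnum;

static_assert(!std::is_same<SketchEnum, SketchComm>::value,
              "the enumerator is not an unsigned int");

// enum -> unsigned int and enum -> bool have the same conversion rank,
// so a one-argument call with the enumerator has no best overload.
static_assert(!std::is_constructible<SketchAdios, SketchEnum>::value,
              "one-argument call with the enumerator is ambiguous");

// A genuine SketchComm value is an exact match for the first constructor.
static_assert(std::is_constructible<SketchAdios, SketchComm>::value,
              "one-argument call with an unsigned int is fine");

// Runtime-visible form of the same check.
bool enum_only_call_is_ambiguous()
{
    return !std::is_constructible<SketchAdios, SketchEnum>::value;
}
```

If the sketch compiles, the static assertions have confirmed that only the enumerator-typed argument triggers the ambiguity.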

Original content

Simple code like adios2::ADIOS adios(MPI_COMM_WORLD); fails to compile. The same is true for the tests and examples. Possibly this is due to the underlying data type of the cluster's MPI_COMM_WORLD conflicting with bool?

Describe the bug On NASA Pleiades, the error exists in both the ADIOS2 built-in tests/examples and my own user code.

~/src/ADIOS2/testing/adios2/engine/bp/TestBPTimeAggregation.cpp(41): error: more than one instance of constructor "adios2::ADIOS::ADIOS" matches the argument list:
            function "adios2::ADIOS::ADIOS(MPI_Comm={unsigned int}, bool)"
            function "adios2::ADIOS::ADIOS(bool)"
            argument types are: (enum <unnamed>)
      adios2::ADIOS adios(MPI_COMM_WORLD);
                          ^

~/src/ADIOS2/testing/adios2/engine/bp/TestBPTimeAggregation.cpp(376): error: more than one instance of constructor "adios2::ADIOS::ADIOS" matches the argument list:
            function "adios2::ADIOS::ADIOS(MPI_Comm={unsigned int}, bool)"
            function "adios2::ADIOS::ADIOS(bool)"
            argument types are: (enum <unnamed>)
      adios2::ADIOS adios(MPI_COMM_WORLD);
$ icpc --version
icpc (ICC) 19.1.3.304 20200925
Copyright (C) 1985-2020 Intel Corporation.  All rights reserved.

$ make
module load comp-intel/2020.4.304; \
module load mpi-sgi/mpt; \
module load hdf5/1.8.18_mpt; \
icpc -DADIOS2_USE_MPI -I/home1/lwang12/src/adios2-install/include -std=gnu++11 -Wl,-rpath,/home1/lwang12/src/adios2-install/lib64 /home1/lwang12/src/adios2-install/lib64/libadios2_cxx11_mpi.so.2.7.1 /home1/lwang12/src/adios2-install/lib64/libadios2_cxx11.so.2.7.1 -lmpi -lmpi++ -Wl,-rpath-link,/home1/lwang12/src/adios2-install/lib64 merge3d-vtk.cpp -o merge3d-vtk
merge3d-vtk.cpp(252): error: more than one instance of constructor "adios2::ADIOS::ADIOS" matches the argument list:
            function "adios2::ADIOS::ADIOS(MPI_Comm={unsigned int}, bool)"
            function "adios2::ADIOS::ADIOS(bool)"
            argument types are: (enum <unnamed>)
    adios2::ADIOS adios(MPI_COMM_WORLD);
                        ^

compilation aborted for merge3d-vtk.cpp (code 2)
Makefile:8: recipe for target 'all' failed
make: *** [all] Error 2

To Reproduce Call adios2::ADIOS adios(MPI_COMM_WORLD);

williamfgc commented 3 years ago

@liangwang0734 was the issue fixed?

liangwang0734 commented 3 years ago

@williamfgc I added a bool (false) and it compiles and works fine.

However, is the following usage expected?:

adios2::ADIOS adios(MPI_COMM_WORLD);

I saw some tests/examples use this. But on the NASA cluster, it fails to compile.

liangwang0734 commented 3 years ago

@williamfgc @chuckatkins @germasch I've re-opened the issue, as it seems to be due to a difference in the SGI MPT / HPE implementation (and should be fixed, in principle). Here, MPI_COMM_WORLD is an integer in an enum. See the updates in my updated main text.

chuckatkins commented 3 years ago

I believe I've got accounts on a few sgi / hpe machines I can test and debug with. This is a good issue to keep open. Thanks for the bug report. You've got a workaround for now but you're right it should be fixed.

halehawk commented 2 years ago

I got the issue as follows:

TestBPChangingShape.cpp(199): error: more than one instance of constructor "adios2::ADIOS::ADIOS" matches the argument list:
            function "adios2::ADIOS::ADIOS(MPI_Comm={unsigned int}, bool)"
            function "adios2::ADIOS::ADIOS(bool)"
            argument types are: (enum <unnamed>)
      adios2::ADIOS adios(MPI_COMM_WORLD);

Is it the same issue as #2893? How can I solve it? Thanks!

eisenhauer commented 2 years ago

Fun with C++... Yes, this is the same problem as #2893 and can be solved by adding a boolean after the MPI communicator in the init call. (The boolean value doesn't matter, it's deprecated and will be ignored.)
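
The workaround can be sketched with stand-in types. All names below (`SketchComm`, `SKETCH_COMM_WORLD`, `SketchAdios`, `workarounds_select_comm_overload`) are hypothetical and are not the real MPT or ADIOS2 declarations; the default arguments are assumed. Passing the extra bool leaves only the two-parameter constructor viable. An explicit cast to the communicator type should also resolve the call, since it makes the argument an exact match, but that variant is not mentioned in this thread and is an assumption here.

```cpp
// Hypothetical stand-ins for illustration only.
enum
{
    SKETCH_COMM_NULL = 0,
    SKETCH_COMM_WORLD = 1,
    SKETCH_COMM_SELF = 2
};
typedef unsigned int SketchComm;

struct SketchAdios
{
    // Same shape as the two candidates in the compiler error;
    // the "= true" defaults are an assumption.
    SketchAdios(SketchComm, bool = true) : usedComm(true) {}
    SketchAdios(bool = true) : usedComm(false) {}

    bool usedComm; // records which overload was selected
};

bool workarounds_select_comm_overload()
{
    // Workaround from the thread: add a bool after the communicator.
    // Only the two-parameter constructor accepts two arguments.
    SketchAdios withBool(SKETCH_COMM_WORLD, false);

    // Assumed alternative: cast to the communicator type so the
    // argument is an exact match for the first constructor.
    SketchAdios withCast(static_cast<SketchComm>(SKETCH_COMM_WORLD));

    return withBool.usedComm && withCast.usedComm;
}
```

In user code the first form corresponds to adios2::ADIOS adios(MPI_COMM_WORLD, false);, which is what was reported above to compile and work.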

halehawk commented 2 years ago

Great, thanks!
