Closed koichi-tsujino closed 1 year ago
Dear koichi-tsujino, Thank you very much for testing cuQuantum 22.11 and reporting the issue. Could you please verify that the environment variables that specify the location of the MPI library are set as described in the docs? If they are, are there multiple MPI libraries installed on the system, such that the wrapper is compiled with one while the app loads another? Could you please also verify that you are using OpenMPI for both the wrapper and the app? Finally, please set CUTENSORNET_LOG_LEVEL=5 so we can see more details in the output. Thanks
Could you please try building and running the tensornet_example_mpi_auto C sample on your machine (the samples are in https://github.com/NVIDIA/cuQuantum/tree/main/samples/cutensornet)? Before running the sample, could you please also check the environment variable $CUTENSORNET_COMM_LIB, which is supposed to point to the libcutensornet_distributed_interface_mpi.so wrapper library?
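A quick way to run the check described above from Python is sketched below. It only inspects the CUTENSORNET_COMM_LIB variable and whether the file it names exists; the helper name check_comm_lib and its return strings are made up for illustration and are not part of cuQuantum.

```python
import os


def check_comm_lib(env=None):
    """Report the state of CUTENSORNET_COMM_LIB (hypothetical helper).

    Returns "unset" if the variable is missing or empty, "missing" if it
    names a path that is not a regular file, and "ok" otherwise.
    """
    env = os.environ if env is None else env
    path = env.get("CUTENSORNET_COMM_LIB")
    if not path:
        return "unset"
    if not os.path.isfile(path):
        return "missing"
    return "ok"


if __name__ == "__main__":
    # Print the variable and the verdict for the current shell environment.
    print("CUTENSORNET_COMM_LIB =", os.environ.get("CUTENSORNET_COMM_LIB"))
    print("status:", check_comm_lib())
```

An "ok" result only means the file exists; it does not confirm that the wrapper was built against the same MPI library the application loads.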
One possible reason why you observe a crash is that the MPI library linked to by the sample you are running is different from the MPI library used by the MPI wrapper libcutensornet_distributed_interface_mpi.so, in case multiple MPI libraries are present in your system. In the meantime, let me try to reproduce your issue locally ...
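To test the mismatch hypothesis above, one option is to compare which MPI shared library each binary resolves to. The sketch below assumes a Linux system with ldd available and assumes that MPI libraries appear in ldd output under names starting with "libmpi"; the helper names are hypothetical. For a Python run, the binary to inspect would be the mpi4py extension module (the path given by mpi4py.MPI.__file__) rather than the C sample.

```python
import re
import subprocess

# Matches ldd lines such as:
#   libmpi.so.12 => /opt/mpich/lib/libmpi.so.12 (0x00007f...)
_LDD_LINE = re.compile(
    r"^\s*(libmpi\S*)\s+=>\s+(.*?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$"
)


def mpi_libs_from_ldd(ldd_output):
    """Map each libmpi* name in ldd output to the path it resolves to."""
    libs = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def same_mpi(ldd_a, ldd_b):
    """True if both ldd outputs resolve to the same non-empty set of MPI paths."""
    paths_a = set(mpi_libs_from_ldd(ldd_a).values())
    paths_b = set(mpi_libs_from_ldd(ldd_b).values())
    return bool(paths_a) and paths_a == paths_b


def run_ldd(path):
    """Run ldd on a binary or shared object and return its text output."""
    return subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    ).stdout
```

For example, same_mpi(run_ldd(wrapper_path), run_ldd(app_path)) returning False would be consistent with the wrapper and the application loading different MPI libraries, where wrapper_path is the value of $CUTENSORNET_COMM_LIB and app_path is the sample binary or the mpi4py extension module.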
On our local machine, the C/C++ sample tensornet_example_mpi_auto works fine with both MPICH and OpenMPI. My guess is that the issue is related to the Python environment setup or something similar ...
Let's convert this to a discussion thread and continue there, since this is not a bug report.
With the following setup:
Hardware: INSPUR NF5488M5 (V100 version)
Environment:
Ubuntu 22.04.1 LTS
Python 3.9.15
Nvidia driver: 525.60.13
cuda_12.0.r12.0
mpich-4.0.3
mpi4py 3.1.4
cuquantum 22.11.0
When I run
/cuQuantum/python/samples/cutensornet/tensornet_example_mpi.py
, it works. But when I run
/cuQuantum/python/samples/cutensornet/tensornet_example_mpi_auto.py
I get the following error. I have tried the other samples and those work.