Open Shikairan opened 3 months ago
A Warning about the above reply and the link to malware on mediafire (from the author of Curl): https://mastodon.social/@bagder/113038399943924413
Thank you!
Hi @Shikairan, thank you for reporting this! Lightning-GPU is bound by the system support of the NVIDIA cuQuantum libraries, and cuStateVec supports CUDA-capable GPUs of generation SM 7.0 (Volta) and greater. Can you try compiling Lightning-GPU + MPI on NVIDIA GPUs with compute capability 7.0+? You may want to use CMAKE_CUDA_ARCHITECTURES to specify the CUDA architecture at compile time.
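One way to check the compute capability before building is to query `nvidia-smi`. The following is a minimal sketch, assuming a driver recent enough to support the `compute_cap` query field. The helper names are illustrative and not part of Lightning.

```python
import subprocess

MIN_COMPUTE_CAP = (7, 0)  # SM 7.0 (Volta), the minimum mentioned above


def parse_compute_caps(text):
    """Parse lines such as '8.6' into tuples such as (8, 6)."""
    caps = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            major, minor = line.split(".")
            caps.append((int(major), int(minor)))
    return caps


def query_compute_caps():
    """Ask nvidia-smi for the compute capability of each visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_compute_caps(out)


def cmake_arch(cap):
    """Convert (8, 6) to '86', the form CMAKE_CUDA_ARCHITECTURES accepts."""
    return f"{cap[0]}{cap[1]}"


if __name__ == "__main__":
    try:
        caps = query_compute_caps()
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("Could not query nvidia-smi:", exc)
        caps = []
    for cap in caps:
        supported = cap >= MIN_COMPUTE_CAP
        print(f"SM {cap[0]}.{cap[1]} -> arch {cmake_arch(cap)}, supported={supported}")
```

The printed architecture value could then be passed to the build, for example by adding `-DCMAKE_CUDA_ARCHITECTURES=86` to `CMAKE_ARGS`, assuming the build forwards `CMAKE_ARGS` to CMake as the install commands in this thread suggest.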
I tested this Docker image and compiled lightning.gpu on 4090/4080/3090 Ti/A800/A100 GPUs; the mpitests fail on all of them.
It is unclear from your report whether you followed the latest installation guideline or the stable one to build lightning.gpu+MPI.
To install the master version of lightning.gpu+MPI in editable mode, you need to use the --config-settings editable_mode=compat pip option as shown below:
PL_BACKEND="lightning_gpu" python scripts/configure_pyproject_toml.py
CMAKE_ARGS="-DENABLE_MPI=ON" python -m pip install -e . --config-settings editable_mode=compat -vv
If this doesn't resolve the problem, try installing lightning.gpu regularly with CMAKE_ARGS="-DENABLE_MPI=ON" python -m pip install . to ensure the package can be found and loaded from site-packages across nodes.
Please let us know if none of the above resolves your issue, and don't hesitate to send us the complete build steps and logs in case of failure.
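To see which installation each MPI rank actually picks up, a small script can report where the package and its compiled extension resolve from. This is only a sketch: `locate_module` is a hypothetical helper, and the module name `pennylane_lightning.lightning_gpu_ops` is taken from the traceback reported in this thread.

```python
import importlib.util


def locate_module(name):
    """Return the file a module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a module has no spec.
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    for name in ("pennylane_lightning", "pennylane_lightning.lightning_gpu_ops"):
        print(name, "->", locate_module(name))
```

Running it under `mpirun` on every node (the script file name is up to you) shows whether all ranks resolve the same build, or whether some fall back to a different copy or find none at all.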
I have been trying to compile this project since last month. I tried both compile commands, and both failed with the same error I mentioned above. I will compile again in Docker to collect all the logs and upload them next week; the project will be compiled on a machine with two 3090 Ti GPUs.
Hey @Shikairan, I'm just following up on this issue. Were you able to compile and test Lightning-GPU with MPI?
Here are the latest logs: base env.txt, penny-lightning install log.txt
The steps in each log are separated by the string "================================================================"
I encountered the same issue.
mpirun -np 2 python -m pytest mpitests --tb=short -x
============================= test session starts ==============================
platform linux -- Python 3.10.13, pytest-8.3.3, pluggy-1.5.0
rootdir: /data/whc/pennylane-lightning
configfile: pyproject.toml
plugins: flaky-3.8.1, xdist-3.6.1, mock-3.14.0, cov-6.0.0
collected 3736 items
mpitests/test_adjoint_jacobian.py ============================= test session starts ==============================
platform linux -- Python 3.10.13, pytest-8.3.3, pluggy-1.5.0
rootdir: /data/whc/pennylane-lightning
configfile: pyproject.toml
plugins: flaky-3.8.1, xdist-3.6.1, mock-3.14.0, cov-6.0.0
collected 3736 items
mpitests/test_adjoint_jacobian.py EE
==================================== ERRORS ====================================
_______ ERROR at setup of TestAdjointJacobian.test_not_expval[dev0-True] _______
mpitests/test_adjoint_jacobian.py:51: in fixture_dev
return qml.device(
/root/anaconda3/envs/mpi310/lib/python3.10/site-packages/pennylane/devices/device_constructor.py:280: in device
dev = plugin_device_class(*args, **options)
pennylane_lightning/lightning_gpu/lightning_gpu.py:354: in __init__
self._statevector = self.LightningStateVector(
pennylane_lightning/lightning_gpu/_state_vector.py:101: in __init__
self._qubit_state = self._state_dtype()(
E pennylane_lightning.lightning_gpu_ops.LightningException: [/data/whc/pennylane-lightning/pennylane_lightning/core/src/simulators/lightning_gpu/MPIWorker.hpp][Line:178][Method:make_shared_mpi_worker]: Error in PennyLane Lightning: custatevec dynamic library load failure
I followed the instructions at https://docs.pennylane.ai/projects/lightning/en/stable/lightning_gpu/installation.html#id1 to compile from source, and ran the test cases on CentOS with two 3090 Ti GPUs and CUDA 12.1.
I also tested in the cuQuantum container, installing with pip, and got this error:
File "/opt/conda/envs/cuquantum-24.03/lib/python3.10/site-packages/pennylane_lightning/lightning_gpu/lightning_gpu.py", line 297, in _mpi_init_helper
raise ImportError("MPI related APIs are not found.")
ImportError: MPI related APIs are not found.
Hey @kevzos and @Shikairan,
Thanks for your interest in distributed Lightning.GPU and for reporting the issue.
Could you please check whether adding path\to\libmpi.so to the LD_LIBRARY_PATH environment variable works?
Please feel free to reach out if there is any issue.
Thanks,
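A quick way to test this suggestion is to look for the libraries in the directories listed in `LD_LIBRARY_PATH` and then ask the dynamic loader to open them. The sketch below uses only the Python standard library. The helper names are illustrative, and the sonames in the `__main__` section are assumptions that may differ between MPI and cuQuantum installations.

```python
import ctypes
import os


def find_library_files(prefix, search_path):
    """List files whose names start with `prefix` in a path-separated list of directories."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            if entry.startswith(prefix):
                hits.append(os.path.join(directory, entry))
    return hits


def can_load(soname):
    """Return True if the dynamic loader can open `soname`."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    search_path = os.environ.get("LD_LIBRARY_PATH", "")
    for prefix in ("libmpi", "libcustatevec"):
        print(prefix, "in LD_LIBRARY_PATH:", find_library_files(prefix, search_path))
    # Assumed sonames; adjust them to match your installation.
    for soname in ("libmpi.so", "libcustatevec.so.1"):
        print(soname, "loadable:", can_load(soname))
```

Note that `LD_LIBRARY_PATH` holds directories, so the entry to add is the directory that contains `libmpi.so`. If a library is listed but not loadable, one of its own dependencies may be missing from the search path.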
The mpitests fail with both "mpirun -np 2 -env UCX_NET_DEVICES=eth0 python -m pytest mpitests --tb=short" and "mpirun -np 2 python -m pytest mpitests --tb=short".
pytest returns the error: "pennylane_lightning.lightning_gpu_ops.LightningException: [/home/pl/pl5/pennylane-lightning-master/pennylane_lightning/core/src/simulators/lightning_gpu/MPIWorker.hpp][Line:178][Method:make_shared_mpi_worker]: Error in PennyLane Lightning: custatevec dynamic library load failure".
I compiled MPI, UCX and lightning.gpu with MPI in the Docker image <nvidia/cuda:12.0.0-cudnn8-devel-ubuntu22.04> (IMAGE ID: bc9059f96b2a).
1: Compiled mpich-4.2.2 from source using the command: ./configure --prefix=/my/path --with-device=ch4:ucx --with-cuda=/my/cuda/path
The examples in the MPI package pass, including <cuda/cudapi test>.
2: Compiled ucx-1.7.0 from source using the command: ../configure --prefix=/my/own/path
It passes the test run with "mpirun -np 2 -env UCX_NET_DEVICES=eth0 ./cuda/cudapi" from the MPI examples.
So the base environment seems to work.
Then I followed the steps in pennylane-lightning to install lightning.gpu with MPI:
1: Installed requirements.txt and requirements-dev.txt with pip, each in a separate conda environment. I tried both.
2: Followed the steps in the Lightning-GPU installation guide.
After that, the MPI pytest suite still fails. The error details are above.
If I use pip to install lightning.gpu (the GPU-only version, without MPI), the pytest suite in tests passes. So custatevec works in the plain, non-MPI build.
Installation log: mpilightning.gpu.install.log