qiboteam / qibojit

Accelerating Qibo simulation with just-in-time compilation.
https://qibo.science
Apache License 2.0

Multi-node #151

Closed alecandido closed 9 months ago

alecandido commented 1 year ago

@stavros11 just to avoid repeating something already done: I'm aware of https://github.com/qiboteam/qibojit/blob/0519c0e2273563d6439583fdca8d3010acb5dd19/src/qibojit/backends/gpu.py#L787 but you might have made a benchmark anyhow.

Do you still have that code?

(Most likely it's not a big deal, but if I can check myself against something external I'm a bit more confident about my own implementation)

codecov[bot] commented 1 year ago

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Files Coverage Δ
src/qibojit/backends/multinode.py 0.00% <0.00%> (ø)


stavros11 commented 1 year ago

> @stavros11 just to avoid repeating something already done: I'm aware of https://github.com/qiboteam/qibojit/blob/0519c0e2273563d6439583fdca8d3010acb5dd19/src/qibojit/backends/gpu.py#L787 but you might have made a benchmark anyhow.
>
> Do you still have that code?
>
> (Most likely it's not a big deal, but if I can check myself against something external I'm a bit more confident about my own implementation)

For benchmarking, qibojit-benchmarks or the benchmark example in qibo should work on GPU if you install qibojit with the cupy dependency (which is optional, not a requirement). The only issue is that it has been quite some time since we last used them, so I am not sure they still work, but they should, unless we made a very breaking change in the meantime.

For testing, both qibo and qibojit tests run on all available backends. So if there is a GPU in your machine and you have installed qibojit and cupy, this will be tested automatically. An easy way to confirm is to execute the tests with and without the

export CUDA_VISIBLE_DEVICES=""

variable and see that the number of executed tests is larger when the GPU is available (and not hidden). If you further install cuquantum this will also be tested.
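
The comparison described above could be scripted roughly as follows. This is only a sketch: the helper names (`env_with_hidden_gpus`, `count_passed`, `run_tests`, `compare_counts`) are hypothetical and not part of qibo or qibojit, and it counts passed tests from pytest's summary line as a stand-in for "executed tests".

```python
import os
import re
import subprocess
import sys


def env_with_hidden_gpus(base=None):
    """Return a copy of the environment in which CUDA sees no devices."""
    env = dict(os.environ if base is None else base)
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def count_passed(pytest_output):
    """Extract the number of passed tests from pytest's summary line."""
    match = re.search(r"(\d+) passed", pytest_output)
    return int(match.group(1)) if match else 0


def run_tests(env, path="."):
    """Run pytest quietly in a subprocess and return how many tests passed."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q", path],
        env=env,
        capture_output=True,
        text=True,
    )
    return count_passed(result.stdout)


def compare_counts(path="."):
    """Return (passed with GPU visible, passed with GPU hidden)."""
    visible = run_tests(dict(os.environ), path)
    hidden = run_tests(env_with_hidden_gpus(), path)
    return visible, hidden
```

Calling `compare_counts()` from the repository root runs the suite twice; on a machine with a working GPU setup the first number should be the larger one.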

alecandido commented 1 year ago

Ok, I completely forgot about qibojit-benchmarks. I guess what I need is:

https://github.com/qiboteam/qibojit-benchmarks/blob/main/scripts/multigpu.sh

and all its closure.

> For testing, both qibo and qibojit tests run on all available backends.

I believe the MultiGpuOps does not really qualify as a backend itself, so you need some code making explicit use of it. https://github.com/qiboteam/qibojit/blob/0519c0e2273563d6439583fdca8d3010acb5dd19/src/qibojit/backends/gpu.py#L786 https://github.com/qiboteam/qibojit/blob/0519c0e2273563d6439583fdca8d3010acb5dd19/src/qibojit/tests/conftest.py#L5

stavros11 commented 1 year ago

> Ok, I completely forgot about qibojit-benchmarks. I guess what I need is:
>
> https://github.com/qiboteam/qibojit-benchmarks/blob/main/scripts/multigpu.sh
>
> and all its closure.

I think it will work, with a few modifications. For example, qibotf was archived and has not been maintained for quite some time, so I would skip it. Also, this script was used on a machine with four physical GPUs. It is possible to test the multi-GPU code even if you have a single GPU, by passing the proper option: accelerators=4/GPU:0 "reuses" the GPU four times, as if it were four separate GPUs, but running sequentially.
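
To make the option format concrete, here is a minimal sketch of how a string like 4/GPU:0 can be turned into a device-to-count dictionary. The function names are hypothetical and this is not the parser used by the benchmark scripts; it only illustrates the `<count>/<device>` convention, assuming several entries are separated by commas.

```python
def parse_accelerators(spec):
    """Turn a string such as "2/GPU:0,2/GPU:1" into {"/GPU:0": 2, "/GPU:1": 2}."""
    accelerators = {}
    for item in spec.split(","):
        # Split "4/GPU:0" into the repetition count and the device label.
        count, sep, device = item.strip().partition("/")
        if not sep or not count.isdigit():
            raise ValueError(f"expected '<count>/<device>', got {item!r}")
        name = "/" + device
        accelerators[name] = accelerators.get(name, 0) + int(count)
    return accelerators


def device_slots(accelerators):
    """Expand the dictionary into one entry per logical device."""
    return [name for name, count in accelerators.items() for _ in range(count)]
```

With this sketch, `parse_accelerators("4/GPU:0")` gives a dictionary with a single physical device used four times, and `device_slots` lists the four logical devices that would be processed one after another. To my understanding, a dictionary of this shape is what the `accelerators` argument of a qibo circuit expects, but check the qibo documentation before relying on that.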

> > For testing, both qibo and qibojit tests run on all available backends.
>
> I believe the MultiGpuOps does not really qualify as a backend itself, so you need some code making explicit use of it.

In principle it is tested in qibo, because some tests have an accelerators argument. So if you use qibojit-cupy backend with accelerators it should trigger the MultiGpuOps but I believe we have never explicitly checked the coverage.

alecandido commented 1 year ago

> In principle it is tested in qibo, because some tests have an accelerators argument. So if you use qibojit-cupy backend with accelerators it should trigger the MultiGpuOps but I believe we have never explicitly checked the coverage.

Indeed, I was trying to reconstruct the path, and it turned out that part of the multi-GPU implementation actually lives inside Qibo, as DistributedQueues and related classes:

https://github.com/qiboteam/qibo/blob/master/src/qibo/models/distcircuit.py

Most likely, I will have to reuse the accelerators dictionary. And the moment I want to modify something, I will have to touch Qibo as well (my plan was to eventually delegate everything to Dask, using GPUs as nodes, simply with a GPU backend, to directly implement the multi-node and multi-GPU consistently and in one shot).

I might still be able to just use Qibo, and avoid any modification, at least for the time being. But I'm still investigating.

alecandido commented 9 months ago

https://github.com/qiboteam/qibo/pull/1132