Closed visagim closed 1 year ago
@DiegoGM91 could you please check? In principle you need 128 * 2^28 bits ~ 4.3gb for a state with 28 qubits.
Sorry I missed a factor of 8, it should indeed be ~4.3gb as @scarrazza says (but not ~16gb)
It also does not seem necessarily related to QAOA:
from qibo import hamiltonians
nqubits = 28
# Create hamiltonian
hamiltonian = hamiltonians.XXZ(nqubits, dense=False)
state = hamiltonian.backend.plus_state(nqubits)
exp = hamiltonian.expectation(state)
print(exp)
Also allocates 16688MiB to GPU.
If we check the expectation code, there are at least 4 objects with state size 2**28 (see code here): state, statec, hstate, matrix @ state (this last one requires at least 1 copy when performing the multiplication), so I believe the 16gb is justified.
That makes sense, thanks!
Is there any way to a priori calculate GPU memory needed to simulate a circuit with a specific backend (eg. cupy)? For example to simulate a simple QAOA circuit from the examples:
With double precision I would expect this to need around 128*2^28 / 1024^2 = 32,768 MiB of memory, however it allocates 16,688 MiB instead, which is about a factor of 2 less. The GPU I have available has at most 24,576 MiB, so it couldn't possibly allocate more than that.
I'm using qibojit with cupy backend. qibo 0.2.0.dev0 qibojit 0.0.7.dev0 cupy 11.4.0 CUDA 11.8
I'm guessing there is some optimization going on, I would like to find the documentation for it if you can point me to it.