pasqal-io / qadence

Digital-analog quantum programming interface
https://pasqal-io.github.io/qadence/latest/
Apache License 2.0

[Feature, Performance] Optimize HamEvo for commuting Hamiltonians #346

Closed · jpmoutinho closed this issue 1 month ago

jpmoutinho commented 4 months ago

Currently, HamEvo of some generator is always exponentiated fully, without checking for commutation relations. However, a generator composed of several commuting parts can be exponentiated part by part. We could make some of those checks automatic in the instantiation of HamEvo and then optimize the calculation in the backend. Below is an example script doing it manually to showcase the potential:

from qadence import N, add, chain
from qadence import HamEvo, run, random_state, equivalent_state

import time

n_qubits = 10

# Consider a generator composed of commuting terms, e.g. a neutral atom interaction on a line register
op = N
gen_full = add(op(i) @ op(i+1) for i in range(n_qubits - 1))

# Let's split it in two parts, such that gen_full = gen_part_1 + gen_part_2
gen_part_1 = add(op(i) @ op(i+1) for i in range(n_qubits // 2))
gen_part_2 = add(op(i) @ op(i+1) for i in range(n_qubits // 2, n_qubits-1))

# Because the two generators commute, they can be exponentiated separately as smaller generators
op_split = chain(HamEvo(gen_part_1, 1.0), HamEvo(gen_part_2, 1.0))

# Here we exponentiate the two together
op_full = HamEvo(gen_full, 1.0)

# Some random initial state
state_init = random_state(n_qubits)

start = time.time()
state_split = run(n_qubits, op_split, state=state_init)
mid = time.time()
state_full = run(n_qubits, op_full, state=state_init)
end = time.time()

# Check that indeed we get the same result
assert equivalent_state(state_split, state_full, atol=1e-12)

# Check the time taken for both calculations
print("Split exponentiation: ", mid - start)
print("Full exponentiation: ", end - mid)

Split exponentiation:  0.010470867156982422
Full exponentiation:  0.6789381504058838

To implement this, there is already a block_is_commuting_hamiltonian function in qadence.blocks.utils that would be useful. It seems efficient, but should be reviewed. This function just returns True or False, so a first implementation could be based on that: essentially, if it returns True, every term in the AddBlock is separated into its own matrix to be exponentiated.
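
A minimal sketch of that first implementation, assuming block_is_commuting_hamiltonian behaves as described above; the helper name split_commuting_hamevo, the import of AddBlock from qadence.blocks, and the use of its .blocks attribute to iterate over terms are my own assumptions, not existing API:

from qadence import HamEvo, chain
from qadence.blocks import AddBlock
from qadence.blocks.utils import block_is_commuting_hamiltonian

def split_commuting_hamevo(generator, parameter):
    # Hypothetical helper: if the generator is a sum whose terms all
    # commute, evolve each term separately; otherwise fall back to
    # exponentiating the full generator.
    if isinstance(generator, AddBlock) and block_is_commuting_hamiltonian(generator):
        return chain(HamEvo(term, parameter) for term in generator.blocks)
    return HamEvo(generator, parameter)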

A next level would be to look for optimizations even when the function returns False, by partitioning the terms into mutually commuting groups and exponentiating each group separately. However, I suspect this would not be straightforward.
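
One possible approach for the grouping, sketched without relying on any qadence API: represent each Pauli-string term as a map from qubit index to Pauli letter, use the fact that two Pauli strings commute iff they differ on an even number of shared qubits, and greedily assign each term to the first group it commutes with:

def pauli_strings_commute(a: dict[int, str], b: dict[int, str]) -> bool:
    # Two Pauli strings commute iff the number of qubits on which they
    # carry different non-identity Paulis is even.
    shared = set(a) & set(b)
    return sum(1 for q in shared if a[q] != b[q]) % 2 == 0

def greedy_commuting_groups(terms: list[dict[int, str]]) -> list[list[dict[int, str]]]:
    # Place each term into the first group whose members all commute
    # with it; open a new group if none fits.
    groups: list[list[dict[int, str]]] = []
    for term in terms:
        for group in groups:
            if all(pauli_strings_commute(term, other) for other in group):
                group.append(term)
                break
        else:
            groups.append([term])
    return groups

# Z(0)Z(1) and Z(1)Z(2) commute; X(1) anticommutes with both.
print(greedy_commuting_groups([{0: "Z", 1: "Z"}, {1: "Z", 2: "Z"}, {1: "X"}]))
# -> [[{0: 'Z', 1: 'Z'}, {1: 'Z', 2: 'Z'}], [{1: 'X'}]]

The greedy pass is quadratic in the number of terms and not guaranteed to produce the fewest groups (the minimal partition is equivalent to a clique-cover problem), but it always yields valid groups.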

Related to https://github.com/pasqal-io/qadence/issues/134, since this would reduce to calling block_to_tensor on each of the smaller commuting terms.

jpmoutinho commented 4 months ago

@vytautas-a as we discussed, I suspect a first implementation of this may only require dealing with the HamEvo class in qadence.operations, but I'm not sure about a clean way to do it. The digital_decomposition method is relevant.
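
For reference, a minimal sketch of the existing decomposition path; I have not checked whether digital_decomposition takes extra arguments, so only the default call is shown:

from qadence import N, add, HamEvo

gen = add(N(i) @ N(i + 1) for i in range(3))

# Explicit digital decomposition of the evolution, as opposed to the
# dense matrix exponentiation otherwise done in the backend.
decomposed = HamEvo(gen, 1.0).digital_decomposition()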

HamEvo instantiates a TimeEvolutionBlock, which is then handled by convert_ops.py in the pyqtorch backend (and likewise in the horqrux backend), so all the code needed should be in these files.

rajaiitp commented 4 months ago

From what I can see, HamEvo checks whether the qubit Hamiltonian is made of commuting terms and generates a gate representation of it. So why does it take different execution times when the generator is specified as one chunk versus as separate commuting chunks, if both have an identical digital decomposition to be implemented in the end?

jpmoutinho commented 4 months ago

> From what I can see, HamEvo checks whether the qubit Hamiltonian is made of commuting terms and generates a gate representation of it. So why does it take different execution times when the generator is specified as one chunk versus as separate commuting chunks, if both have an identical digital decomposition to be implemented in the end?

Actually, it currently only does that if you call HamEvo.digital_decomposition(). Otherwise it builds the full matrix for the Hamiltonian and then exponentiates it, which is why the split version is faster in my example: it exponentiates two smaller matrices instead of one large one.
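
The identity being exploited is exp(-i(A + B)t) = exp(-iAt) exp(-iBt), which holds whenever [A, B] = 0. A tiny standalone check with torch (diagonal matrices always commute):

import torch

d = 8
a = torch.diag(torch.rand(d)).to(torch.complex128)
b = torch.diag(torch.rand(d)).to(torch.complex128)

# exp(-i(A + B)) == exp(-iA) @ exp(-iB) because A and B commute.
full = torch.linalg.matrix_exp(-1j * (a + b))
split = torch.linalg.matrix_exp(-1j * a) @ torch.linalg.matrix_exp(-1j * b)
assert torch.allclose(full, split)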

But the suggestion here is exactly to make the commutation check happen automatically and to optimize based on it.

rajaiitp commented 4 months ago

Cool, understood. I would like to add a few points I have been thinking about regarding the design of the algorithm.

rajaiitp commented 3 months ago

Found a paper about partitioning a Pauli decomposition into collections of commuting sets: https://arxiv.org/pdf/1908.06942.pdf

Apparently there is a lot of work on this, mostly in the direction of efficient observable estimation with minimal shots, but I think it can be used here as well.

jpmoutinho commented 3 months ago

Nice @rajaiitp! Yes, this seems relevant.

jpmoutinho commented 3 months ago

Also came across this one: https://arxiv.org/pdf/1907.09040.pdf

jpmoutinho commented 1 month ago

Closing after opening in PyQTorch: https://github.com/pasqal-io/pyqtorch/issues/177