PennyLaneAI / pennylane

PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.
https://pennylane.ai
Apache License 2.0
2.29k stars 591 forks source link

Prevent writing to a specific file when creating the Hamiltonian #5785

Open minhtriet opened 4 months ago

minhtriet commented 4 months ago

Feature details

When doing gradient descent for optimizing the coordinates of molecules, it is desirable to create many Hamiltonian quickly. It is because we are shifting every coordinate just by a tiny bit and regenerate the Hamiltonian

Right now, Pennylane is creating a file, for my case molecule_pyscf_sto-3g.hdf5, and other Hamiltonians created would just overwrite this file.

Implementation

tempfile.NamedTemporaryFile should be able to help us. I am willing to discuss further and author a PR :)

How important would you say this feature is?

2: Somewhat important. Needed this quarter.

Additional information

Test case pseudocode (adapt from real code)

 def test_hamiltonian_construction(self):
        symbols = ['H', 'H']

        # Known coordinates for H2 molecule
        np.random.seed(5)
        coordinates_list = np.random.random((18,6))

        # Call the parallel_build_hamiltonian function
        start_parallel = time.time()
        hamiltonians = parallel_build_hamiltonian(coordinates_list)
        done_parallel = time.time()

        start_serial = time.time()
        expected_h1 = qchem.molecular_hamiltonian(symbols, coordinates_list[0])
        expected_h2 = qchem.molecular_hamiltonian(symbols, coordinates_list[1])
        done_serial = time.time()

        # Check the results
        assert done_parallel - start_parallel < 0.65 * (done_serial - start_serial)
        # assert correct hamiltonians and num qubits
soranjh commented 4 months ago

Thanks @minhtriet for creating the issue. Could you please benchmark this for a few molecules and provide the timing results here? This will help us to see how much improvement we gain. Thanks.

minhtriet commented 4 months ago

Thanks @minhtriet for creating the issue. Could you please benchmark this for a few molecules and provide the timing results here? This will help us to see how much improvement we gain. Thanks.

sure, let me implement that and benchmark

minhtriet commented 4 months ago

https://github.com/PennyLaneAI/pennylane/pull/5792/ Please review guys :)

CatalinaAlbornoz commented 4 months ago

Hi @minhtriet, Before reviewing the PR it would be good to know the results of running your code for several molecules. Eg: how long does it take to run this for H2? And LiH, H2O, and N2?

minhtriet commented 4 months ago

Hi @minhtriet, Before reviewing the PR it would be good to know the results of running your code for several molecules. Eg: how long does it take to run this for H2? And LiH, H2O, and N2?

Are we interested in the actual time, or just how many percent faster parallel mode is in comparison to serial mode? Right now I measured that it reduce the speed needed for NH3 by at least 40%.

Can do with LiH and H2O. For H2 it parallel mode is slower than serial mode

CatalinaAlbornoz commented 3 months ago

Knowing the actual times would be ideal @minhtriet . Even in cases where there's a slowdown.

minhtriet commented 3 months ago

Hi, here is the update

PASSED [ 20%]['N', 'H', 'H', 'H']
Parallel: 8.693075180053711
Serial: 26.220473051071167

FAILED [ 40%]['H', 'H']
Parallel: 1.5865538120269775
Serial: 0.07684993743896484

FAILED [ 60%]['Li', 'H']
Parallel: 2.3994500637054443
Serial: 1.811594009399414

PASSED [ 80%]['H', 'H', 'O']
Parallel: 3.6580381393432617
Serial: 7.1287360191345215

PASSED [100%]['N', 'N']
Parallel: 5.042972803115845
Serial: 11.95625662803649

The set up and test code can be found at https://github.com/PennyLaneAI/pennylane/blob/5a9a2d4e9d4c16c24fcdf142a4cd4a14144f761a/tests/qchem/of_tests/test_molecular_hamiltonian.py#L1350

CatalinaAlbornoz commented 3 months ago

Thank you @minhtriet ! This is very useful info.

CatalinaAlbornoz commented 3 months ago

@minhtriet Do you want to make a Pull Request (PR) to implement this? Have you done a contribution to PennyLane before? This blog post is a good guide on how to contribute.

If you're interested you can start working on it already and we'll review it in the next few weeks. We won't be able to review it immediately but we'll review and provide feedback as soon as we have capacity.

Let me know if you have any questions!

minhtriet commented 3 months ago

5792 Please review guys :)

Sure @CatalinaAlbornoz. Here is my PR

CatalinaAlbornoz commented 3 months ago

Thanks @minhtriet ! It may take us a few weeks to review but we will get back to you. In the meantime, I noticed that there seem to be conflicts with a file. You need to resolve it using the command line. image

mlxd commented 3 months ago

Hi @minhtriet I noticed the title and details of this issue do not match with the content or expectations set out by the title. I have also provided a review for the PR #5792 which seems to be concerned with using multiprocessing to handle the Hamiltonian building in parallel, which differs from the temporary file issue discussed here.

Since the reported performance numbers show regressions for smaller systems, I think this will need to be mitigated for an implementation that provides the support to an end user. I'd recommend considering either updating the expectations and issue reported here to match the parallel work, or to consider the PR to implement the temporary file work.