[Discussion] Dataset integrations

tensorflow / quantum

Hybrid Quantum-Classical Machine Learning in TensorFlow

https://www.tensorflow.org/quantum

Apache License 2.0


[Discussion] Dataset integrations #277

Open MichaelBroughton opened 4 years ago

MichaelBroughton commented 4 years ago

We recently had our first outside contributor add to TFQ datasets. Big thanks @therooler and the rest of the U of T folks! Now that we have our first example in place, let's use this issue to plan out the new datasets we want to add in an open discussion. To my knowledge these sorts of things were up for discussion (correct me if I'm wrong @therooler ) :

~1d TFI chain~
2d TFI grid
TFI on a star
XY model on chain (Mentioned in meeting, but I'm not 100% on this) ?
XY model on grid (Mentioned in meeting, but I'm not 100% on this) ?

Other talking points:

What sorts of other spin system style models do we want? (AKLT, XYZ, XZX etc.)
What sorts of properties should we incorporate as dataset labels? (paramagnetic, ferromagnetic, ground state energy, desired unitary etc.)
What sorts of other NON spin system style datasets do we want ? (perhaps an ensemble of noisy circuits which all correspond to some hidden unitary we wish to realize. Maybe a series of circuits containing time correlated errors that we need to identify)

refraction-ray commented 4 years ago

Hi, I am just curious what the format for these quantum datasets is, i.e. are these quantum data stored in the form of circuits or wavefunctions? tfq seems to only naturally interact with data in the form of preparation circuits (i.e. data = tfq.convert_to_tensor(circuit)).

For quantum state data, there are three common representations: a wavefunction, a tensor network, or a preparation circuit acting on the zero state. Which representation is simple and compact differs from state to state. For example, some data are obtained from ED/DMRG calculations and some others are more easily expressed in terms of circuit operations.

Is there a simple and constructive way to build efficient preparation circuits from other expressions like wavefunction or MPS (MPS seems to be possible)? If so, any references on that? If not, then how to incorporate quantum data which is better expressed in the form of wavefunctions or tensor networks in tfq?

MichaelBroughton commented 4 years ago

Good question! Some initial thoughts: In general a library like Cirq encourages research/experiments that are relevant for the NISQ era (though it is possible to do much more than just this). This places the focus on circuits, specifically ones that are constant or log depth (polynomial is fine too). That should also be our focus here.

With that in mind I'm not aware of any general tools that let one go from wavefunction -> a small circuit or from some tensor network -> a small circuit. With our tfi_chain dataset we opted to use a "reasonable linear depth VQE circuit" to prepare the states with a few 9s of fidelity to the states that were calculated using analytic methods.
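
As an editorial illustration of what "a few 9s of fidelity" means, here is a minimal sketch in plain Python. The helper names `fidelity` and `nines_of_fidelity` are hypothetical and not part of TFQ or Cirq, and the state vectors are made-up toy values rather than dataset entries.

```python
import math

def fidelity(psi, phi):
    """Return |<psi|phi>|^2 for two state vectors given as sequences of amplitudes.

    Both vectors are normalized internally, so unnormalized input is accepted.
    """
    if len(psi) != len(phi):
        raise ValueError("state vectors must have the same length")
    overlap = sum(complex(a).conjugate() * b for a, b in zip(psi, phi))
    norm_psi = sum(abs(a) ** 2 for a in psi)
    norm_phi = sum(abs(b) ** 2 for b in phi)
    return abs(overlap) ** 2 / (norm_psi * norm_phi)

def nines_of_fidelity(f):
    """Return -log10(1 - f); a value near 3 corresponds to a fidelity near 0.999."""
    if f >= 1.0:
        return math.inf
    return -math.log10(1.0 - f)

# Toy example: an exact two-qubit state and a slightly imperfect preparation of it.
exact = [2 ** -0.5, 0.0, 0.0, 2 ** -0.5]
prepared = [0.7075, 0.01, 0.0, 0.7066]
f = fidelity(exact, prepared)
print(f, nines_of_fidelity(f))
```

The overlap is insensitive to a global phase, which is why the absolute value is taken before squaring.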

That's not to say this is the only way to do things. Looking at the case of the AKLT model (which has a nice MPS representation) there could be several different approaches to get to a concrete circuit. Maybe one following the methods outlined here ( https://arxiv.org/pdf/1707.05787.pdf ) or maybe another "reasonable VQE circuit" is the way to go. In general we're pretty open to many different approaches. One goal with curating these datasets is to provide a nice standard people can test their models with and there is no "best way" to do these sorts of things :).

refraction-ray commented 4 years ago

@MichaelBroughton, thanks for your answer. I'm glad to learn that VQE can strike such a good tradeoff between circuit depth and output fidelity. If this is always the case, then VQE is a very promising general approach for bridging the gap between other state representations and preparation circuits. (Though I guess it is subtle for highly entangled states, which may require deeper circuits to achieve high fidelity.)

therooler commented 4 years ago

Thanks, we're excited to be a part of the TFQ data development process! Let me summarize our results again:

For these models we actually have the data:

1.) 1D XXZ chain (closed) for N=4,8,12,16 with a depth N circuit for Delta in [0.5,1,5]. We can solve both the chains (TFI and XXZ) with open boundary conditions. But maybe this should not be a priority, because the important physics is similar.
2.) 2D TFI model on 3x3, 4x3 and 4x4 lattice for g in [2.0, 4.0]
3.) 2D XXZ model on 3x3, 4x3 and 4x4 lattice for in [2.0, 4.0]

I would have to check the consistency of the data and clean some stuff up, but those should be implementable in the short run. From a research standpoint these 2D models are very interesting. I think solving a big (5x5, 5x6) 2D TFI model could be a good testing ground to test TFQ/qsim's capabilities. I chatted a bit with Alan a month ago about using cirq for 30+ qubit systems, but I haven't had the time to look into that until now.

We're still actively researching the Kagome (star) lattice models, and have solved these for a handful of order values. Due to this being active research as well as the high computational cost required for solving these we should probably put those on the back burner.

I'll think a bit about the other points you mentioned and I will discuss it with the rest of our team.

therooler commented 4 years ago

Maybe I can also summarize some of the discussion we had about the labels for the datasets for the spin_systems module in TFQ, so that people have some idea what the current setup is. Right now, we supply a quantum circuit with resolved parameters that corresponds to the ground state wave function of a Hamiltonian for a certain order parameter. In addition, we return the exact 2^N ground state and its corresponding energy, together with a label that classifies the phase (0=paramagnetic, 1=critical, 2=ferromagnetic, etc....).
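
To make the label scheme concrete, here is a minimal editorial sketch of one dataset entry and of the phase labeling for the 1D TFI chain. Everything named here is hypothetical: `SpinSystemRecord` and `tfi_phase_label` are not TFQ classes or functions, the integer encoding simply follows the one quoted in the comment above (the shipped dataset may use a different one), and the Hamiltonian convention H = -sum Z_i Z_{i+1} - g sum X_i with g >= 0 is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence

# Encoding as quoted in the comment above; the shipped dataset may differ.
PARAMAGNETIC, CRITICAL, FERROMAGNETIC = 0, 1, 2

def tfi_phase_label(g, tol=1e-9):
    """Phase label for the 1D TFI chain, assuming H = -sum ZZ - g sum X and g >= 0.

    The transition sits at g = 1: ordered (ferromagnetic) below, paramagnetic above.
    """
    if abs(g - 1.0) <= tol:
        return CRITICAL
    return PARAMAGNETIC if g > 1.0 else FERROMAGNETIC

@dataclass(frozen=True)
class SpinSystemRecord:
    """Hypothetical container mirroring the fields described in the comment."""
    order_parameter: float           # value of g at which the system was solved
    circuit_params: Sequence[float]  # resolved parameters of the variational circuit
    ground_state: Sequence[complex]  # exact state vector with 2**N amplitudes
    ground_energy: float             # exact ground state energy
    phase_label: int                 # one of the three constants above

record = SpinSystemRecord(
    order_parameter=0.5,
    circuit_params=[0.1, 0.2],         # placeholder values, not real data
    ground_state=[1.0, 0.0, 0.0, 0.0],  # placeholder values, not real data
    ground_energy=-2.0,                 # placeholder value, not real data
    phase_label=tfi_phase_label(0.5),
)
```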

We described two example use cases for how this data could be used:

Use case 1:

The user wants to classify the phase of the system using a Quantum Neural Network. To set this up, they import the variational circuits with resolved parameters from TFQ, and feed the states from these circuits to their QNN. There is no need for the user to run their own optimization algorithms (which for N>=16 can take hours for each system) to find the parameters.

Use case 2:

A user wants to find a novel low depth quantum circuit to represent the ground state. Using the data set, they can quickly obtain the exact ground state and energy as ground truth, without having to diagonalize hundreds of systems themselves (a costly procedure for larger systems). With the supplied additional data the user can benchmark their algorithm.
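
As an editorial sketch of this kind of benchmark, the snippet below computes a reference energy for the 1D TFI chain and the relative error of an estimate against it. It assumes the convention H = -sum Z_i Z_{i+1} - g sum X_i on a periodic chain with an even number of sites, for which the ground energy follows from the standard free-fermion (Jordan-Wigner) solution. The function names are hypothetical, and the "estimate" is a made-up number standing in for the output of a user's algorithm.

```python
import math

def tfi_chain_ground_energy(n_sites, g):
    """Exact ground energy of H = -sum Z_i Z_{i+1} - g sum X_i on a periodic chain.

    Uses the free-fermion result in the even-parity sector:
    E0 = -sum_k sqrt(1 + g^2 - 2 g cos k), with k = (2m + 1) pi / N.
    Assumes an even number of sites and g >= 0.
    """
    momenta = [(2 * m + 1) * math.pi / n_sites for m in range(n_sites)]
    return -sum(math.sqrt(1.0 + g * g - 2.0 * g * math.cos(k)) for k in momenta)

def relative_energy_error(e_estimate, e_exact):
    """Relative deviation of an estimated energy from the reference value."""
    return abs(e_estimate - e_exact) / abs(e_exact)

e_exact = tfi_chain_ground_energy(4, g=1.0)
e_estimate = -5.2  # made-up stand-in for the result of a variational algorithm
print(e_exact, relative_energy_error(e_estimate, e_exact))
```

For models without a closed-form solution the reference energy has to come from exact diagonalization or a similar method, which is where a precomputed dataset saves the user the most work.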

therooler commented 3 years ago

I just had an idea about a potential data set that we could easily implement.

Right now, what I have done with the spin_system module is to add data sets for which we know the variational circuit + parameters that give us the ground state. The reason I haven't added any more for a while is that we are having trouble reliably solving the ground state problem with a single circuit architecture across different system sizes and phases for other physical models.

So what if instead we create a data set that contains the Hamiltonians and ground state energies for a bunch of spin systems for which no variational circuit + optimization solution has yet been reported in the literature? This could give people an easy way to test their favorite variational algorithm. And even if there is already a solution known, this could enable people to benchmark different algorithms and circuit architectures in a standardized way. I am thinking in the direction of how people report stuff in the ML literature, where you get a table of MNIST, ImageNet, CIFAR, etc. scores that a novel method is outperforming the state-of-the-art on. Maybe this doesn't work that well for quantum circuits, but that is kind of the idea.

More concretely, for each model, we create an interface similar to the spin_system one, where we ask the user for the system name (TFIM, XXZ, XY, Kitaev, Haldane-Shastry,..), system size, order parameter (can be a single parameter or more depending on the physical model), boundary condition (open, closed, torus, cylinder,...), topology (1d, 2d, triangular,...) and then as output you get a cirq.PauliSum, and the ground state energy at the specified point in the phase diagram (we will have to calculate this for the user).

This cirq.PauliSum can then be used to run VQE type algorithms in TFQ. The ground state energy can be used to check that you reached the ground state.

Would this be interesting? Let me know what you think!

MichaelBroughton commented 3 years ago

I think this would be very useful to have! It might be worth writing a little design doc / RFC for how you envision this feature would fit into TFQ. My knee jerk reaction would be maybe these would be some kind of "less structured" class of datasets since we don't provide the user with as much information. It would be good to put this in writing and plan out some different possible implementations too! If we don't have to provide the solved circuit how feasible is it to generate this data on the fly ?

therooler commented 3 years ago

I'll try to write something up. I am thinking something along the line of how it's done in OpenFermion: https://quantumai.google/reference/python/openfermion/hamiltonians