Hi @hpssjellis, thanks for raising this issue. Can you help me understand what you are getting at with your Keras comment? Are you looking for more examples of how to write your own Keras layers, how to use the Keras layers we provide, or how the math works inside of our layers?
On a related note, have you had a chance to look at other tutorials that also contain lots of Keras layer usage (many of which are the 1-3 input/output type layers you describe)? https://www.tensorflow.org/quantum/tutorials/hello_many_worlds#2_hybrid_quantum-classical_optimization and https://www.tensorflow.org/quantum/tutorials/qcnn#15_define_layers
Hi @MichaelBroughton thanks for the reply. I don't think any of the examples are close to what I want. Possibly what I am asking for cannot be done, but here goes:
Can you use a quantum computer to train a single generic hidden dense layer of a standard 3-layer Keras model where the inputs are float32 and the outputs are float32?
Let's say 2 input nodes, a 6-node hidden layer, and 1 output node with softmax for output probabilities.
I want the quantum computer to improve the weights of the hidden layer. That's it.
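For concreteness, here is a plain Keras sketch of the kind of model being described (my own illustration; note that softmax over a single output node is degenerate, so a sigmoid output is shown instead):

import tensorflow as tf

# Plain Keras sketch of the 2-6-1 model described above. Softmax over a single
# output node always yields 1.0, so a sigmoid output is used for the probability.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),                # 2 input nodes
    tf.keras.layers.Dense(6, activation='relu'),      # hidden layer to be "quantum trained"
    tf.keras.layers.Dense(1, activation='sigmoid'),   # 1 output node (probability)
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])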
There seems to be a paper about it, https://arxiv.org/abs/1806.09729, but it looks like you helped write it.
Ahhhh ok, now I think I'm starting to understand a little better. Thanks for clarifying things for me.
Can you use a quantum computer to train a single generic hidden dense layer of a standard 3-layer Keras model where the inputs are float32 and the outputs are float32? .... Let's say 2 input nodes, a 6-node hidden layer, and 1 output node with softmax for output probabilities.
So the short answer is no. The longer answer is "you could try this with a fair amount of effort, but it won't give the kind of boosts you're hoping for". The kinds of ideas we had in https://arxiv.org/abs/1806.09729, where you could use a quantum computer for a sort of 1:1 analog of Dense layers that might give you different/better training parameters, were based on a hybrid platform of interacting qubits and (simulated) harmonic oscillators and were much more on the theory side of things (even implementing the XOR example in that paper took an enormous amount of compute power and we used 3- and 4-bit numbers, so going up to anything larger like float32 is definitely out of reach for today's tech).
What people are trying out a lot with PennyLane (like in that example you linked) is something more along the lines of "sending our datapoints/intermediate tensors through a quantum circuit and seeing what happens". The optimization process is still completely classical and uses things like SGD (or whatever); what happens is that your data becomes "momentarily quantum" inside your model. This isn't exactly the same as having a quantum computer optimize your parameters for you, but it's still somewhat interesting nevertheless. Here's a quick example for you:
import cirq
import sympy
import numpy as np
import tensorflow as tf
import tensorflow_quantum as tfq
def my_embedding_circuit():
    # Note: this must have the same number of free parameters as the layer that
    # feeds into it from upstream. In this case you have 16.
    # Can play around with different circuit architectures here too.
    qubits = cirq.GridQubit.rect(1, 16)
    symbols = sympy.symbols('alpha_0:16')
    circuit = cirq.Circuit()
    for qubit, symbol in zip(qubits, symbols):
        circuit.append(cirq.X(qubit) ** symbol)
    return circuit

def my_embedding_operators():
    # Get the measurement operators to go along with your circuit.
    qubits = cirq.GridQubit.rect(1, 16)
    return [cirq.Z(qubit) for qubit in qubits]

def create_hybrid_model():
    # A LeNet with a quantum twist.
    images_in = tf.keras.layers.Input(shape=(28, 28, 1))
    # Dummy circuit input needed for Keras to be happy.
    dummy_input = tf.keras.layers.Input(shape=(), dtype=tf.dtypes.string)
    conv1 = tf.keras.layers.Conv2D(32, [3, 3], activation='relu')(images_in)
    conv2 = tf.keras.layers.Conv2D(64, [3, 3], activation='relu')(conv1)
    pool1 = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(conv2)
    dropout1 = tf.keras.layers.Dropout(0.25)(pool1)
    flat1 = tf.keras.layers.Flatten()(dropout1)
    dense1 = tf.keras.layers.Dense(128, activation='relu')(flat1)
    dropout2 = tf.keras.layers.Dropout(0.5)(dense1)
    dense2 = tf.keras.layers.Dense(16)(dropout2)
    # Send the 16 classical activations through the quantum circuit as its controls.
    quantum_embedding = tfq.layers.ControlledPQC(
        my_embedding_circuit(), my_embedding_operators())([dummy_input, dense2])
    output = tf.keras.layers.Dense(10)(quantum_embedding)
    model = tf.keras.Model(inputs=[images_in, dummy_input], outputs=[output])
    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.Adam(),
        metrics=['accuracy'])
    return model

hybrid_model = create_hybrid_model()
hybrid_model.summary()

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
# Rescale the images from [0, 255] to the [0.0, 1.0] range.
x_train, x_test = x_train[..., np.newaxis] / 255.0, x_test[..., np.newaxis] / 255.0
# One empty circuit per training example to feed the dummy circuit input.
dummy_train = tfq.convert_to_tensor([cirq.Circuit() for _ in range(len(x_train))])
hybrid_model.fit(
    x=(x_train, dummy_train), y=y_train,
    batch_size=32,
    epochs=5,
    verbose=1)
This will train a LeNet that has a "momentarily quantum" component on MNIST data. It will pump the second-to-last intermediate tf.Tensor in the network through a quantum circuit via the TFQ ControlledPQC layer. With this setup, after 5 epochs you should definitely get 98%+ accuracy. It's basically a larger-scale version of what's in that PennyLane tutorial on a larger, more compelling dataset (I tried to do this same thing in PennyLane and what would take around 30 minutes with TFQ wound up looking like it'd be hours, so I didn't bother waiting around).
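If you also want the held-out accuracy, a quick continuation of the snippet above (a sketch only, reusing the same variable names) would be something like:

# Sketch: evaluate on the test split, reusing the names from the snippet above.
dummy_test = tfq.convert_to_tensor([cirq.Circuit() for _ in range(len(x_test))])
hybrid_model.evaluate(x=(x_test, dummy_test), y=y_test, batch_size=32, verbose=1)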
With all that being said, the literature on QML is moving away from these kinds of ideas. Classical ML is already pretty great at dealing with data that comes from the classical world. In the above LeNet example, if you remove the quantum layer from the network, it will still perform about as well as it did with the quantum layer in it. Not for lack of trying, there isn't a lot of compelling work out there that tells us "making your classical data momentarily quantum" OR "using quantum optimizers for your 1M+ parameter neural network" makes an appreciable difference on relevant problems (it might help on toy problems, but we want to try and search for relevant problems with TFQ). Maybe it's because academics aren't looking in the right places or haven't thought about it the right way, but this area isn't looking super promising..... right now ;).
Recently (shameless plug incoming) we finished working on https://arxiv.org/abs/2011.01938, which explores in a little more depth why it might not be so easy for quantum computers to give immediate wins on classical data, and why a shift to the paradigm of using your quantum computer to model quantum data (quantum systems, quantum sensors, or things that measure at quantum scale) that is coupled to, or pre-existing on, your quantum computer's qubits in some way might be an avenue for QML to win out against classical ML (although even in that setting it's not so easy).
TFQ is very much still a research/tinker member of the TensorFlow ecosystem; we want to encourage people to explore quantum data and help build up more of an understanding of when and where QML can win. With that goal in mind, we tailored a lot of our onboarding materials (tutorials, API examples, research code, the QML concepts page, etc.) around that.
We could put in another example about QC + classical data, but we might need to caveat things in there with something to the effect of "While it is interesting to try these sorts of things, the QML community has moved away from these kinds of ideas....." or something.
Does this help clear things up?
As someone with experience on more of the applied/engineering side of the TF user base, do you think something like this in tutorial/example form would be valuable? Maybe just to signal that QML is still very much geared towards research/tinkering and that the theory interest in this particular area isn't huge? Maybe we could put this LeNet example on the research branch of the repo? Curious to hear your thoughts @hpssjellis, @zaqqwerty, @lamberta?
Michael
Dear all,
A few months ago I stumbled upon a similar issue and, with the help of Michael, it finally worked. See issue "PQC in the middle of network, contd." #267.
As a result I built a notebook for my students to demonstrate this kind of hybrid network. If it is interesting for the community I can share it, of course....
One clarification: I am not looking for faster/slower or better/worse results, just proof that quantum computers can train regular Keras dense layers (if possible). My goal is to then put those results on other TensorFlow models as a proof of interchangeability.
Please share your notebook @ghellstern. I put the last code I found from #267 on my GitHub: https://github.com/hpssjellis/my-examples-for-quantum-computing/blob/main/tensorflowQuantum/a08-ghellstern01-py and yes, it works, but as with all quantum Keras layers, I can't sensibly save the model.
By the way @MichaelBroughton, I can save the model with this trick: saving the weights, creating a pure Keras model, loading the weights, and saving that model. But the data is irrelevant, as the layer was quantum trained, not Keras trained.
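In case it helps anyone reading along, a rough sketch of that trick (build_pure_keras_model is a hypothetical helper that rebuilds only the classical layers, reusing the same layer names as the hybrid model):

# Rough sketch of the weight-copying trick described above (hypothetical names).
pure_model = build_pure_keras_model()  # assumed to mirror the hybrid model's classical layers
for pure_layer in pure_model.layers:
    if pure_layer.get_weights():  # skip weightless layers (Input, Flatten, Dropout, ...)
        pure_layer.set_weights(
            hybrid_model.get_layer(pure_layer.name).get_weights())
pure_model.save('pure_model.h5')  # the quantum-free copy now saves with plain Keras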
Thanks for chiming in @ghellstern, I had forgotten there were lots of useful snippets in #267!!
@hpssjellis:
Just proof that quantum computers can train regular Keras dense layers (if possible). My goal is to then put those results on other TensorFlow models as a proof of interchangeability.
If you want the quantum computer to take care of the entire training portion, then neither the example you linked from PennyLane nor the snippet I just provided accomplishes that. In general, getting the entire training algorithm to be fully quantum like this one: https://arxiv.org/abs/1806.09729 simply won't be viable at the scales you want (i.e. bare-minimum compatibility with TensorFlow and Keras float32 precision; even float16 or float8 precision is way out of reach, so we'd need to wait until we get a bigger quantum chip actually built to try ideas like this one at these sorts of scales with circuit-model quantum computers). So training and then interchanging layers in this way won't be viable.
Circling back to your earlier point:
but I have had to go to PennyLaneAI to find good simple Keras layers examples that abstract away the Quantum Math when training reasonably easy Keras Layer Models
Both demos make use of the Keras API for classical training loops with "momentarily quantumifying" components inside of the tf.keras.Model instances they are training, making them thematically pretty much identical to one another in terms of training loop and algorithmic logic inside the model architecture itself (the only things different about them are the datasets and quantum circuit parameterizations used). The PennyLane tutorial isn't employing a fully quantum training algorithm and neither are we. So if you are looking for fully quantum versions of the entire training loop, neither of these examples will do that for you; but if you are looking for the same functionality as the PennyLane tutorial you linked, implemented in TFQ, then the snippet I gave does that.
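To make the shared pattern concrete, here is roughly what the smallest version of it looks like in TFQ (a sketch for illustration, not taken from either tutorial): one qubit, one trainable circuit parameter, and classical Adam doing all of the optimization.

import cirq
import sympy
import numpy as np
import tensorflow as tf
import tensorflow_quantum as tfq

qubit = cirq.GridQubit(0, 0)
theta = sympy.Symbol('theta')

# A one-qubit circuit whose single free parameter is trained by plain Keras/Adam.
model_circuit = cirq.Circuit(cirq.X(qubit) ** theta)

# Input: a batch of (here empty) data circuits; output: <Z> fed through a Dense(1).
circuit_input = tf.keras.layers.Input(shape=(), dtype=tf.dtypes.string)
expectation = tfq.layers.PQC(model_circuit, cirq.Z(qubit))(circuit_input)
output = tf.keras.layers.Dense(1)(expectation)

tiny_model = tf.keras.Model(inputs=circuit_input, outputs=output)
tiny_model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam())

# Toy training run: drive the output towards 1.0 for a single empty input circuit.
x = tfq.convert_to_tensor([cirq.Circuit()])
y = np.array([[1.0]])
tiny_model.fit(x, y, epochs=25, verbose=0)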
Is the concern here that TFQ doesn't have many snippets like the one I just gave or the ones found in the PennyLane tutorial, or is this also related to your first question?
By the way @MichaelBroughton, I can save the model with this trick: saving the weights, creating a pure Keras model, loading the weights, and saving that model. But the data is irrelevant, as the layer was quantum trained, not Keras trained.
I'm not sure I understand what you're getting at here, but you can check out https://stackoverflow.com/questions/60682091/cannot-save-tensorrflow-keras-quantum-model-using-save-pickle for information that might help with model saving.
@hpssjellis: I added the notebook I mentioned to GitHub: https://github.com/ghellstern/QuantumNN/blob/master/Multi-QBit-Classifier%20TF%20NN-Encoding_Github.ipynb Presumably it's not the most elegant way to build a hybrid network, but it is at least a proof of concept. It would be great to have a discussion about possible improvements....
@MichaelBroughton: By the way, in the last month I tried different methods to connect TensorFlow with Qiskit (not via PennyLane, which is possible but awfully slow). Why Qiskit? Because in the long run it would be nice to try this stuff on a physical device and I have access to the IBM hardware. As far as I can tell, unfortunately neither IBM (which prefers PyTorch) nor Google (Cirq is a competitor of Qiskit) is interested in promoting this :-)
@ghellstern " in the last month I tried different methods to connect tensorflow with Qiskit" That really should be it's own thread. I agree, testing Quantum computing on a simulator is nice but when https://www.ibm.com/quantum-computing has made things so easy, it is a shame that we can't easily connect to it. The other option is that somehow Google makes easy public access to a few of its older Quantum Computers. Since my background is with TensorflowJS, I would really like to use Javascript to connect to a quantum computer. I think https://www.rigetti.com/ has a javascript connection but it is not easy.
Thanks @MichaelBroughton for explaining the issue with training a Keras layer. I was getting a bit discouraged, but I feel much better now knowing that it is the size of a float32 that is causing the problem. In TensorflowMicro we are always looking at int8 quantization to reduce the size of our models. Is it possible that an int8-trained layer could work on a present quantum computer?
Since I am not interested in the model actually being better, I just want to see if it is possible: could we train a layer using int8 and then map those weight values to float32? I would even be fine with 4 bits (int4?).
Quote from @MichaelBroughton: "In general, getting the entire training algorithm to be fully quantum like this one: https://arxiv.org/abs/1806.09729 simply won't be viable at the scales you want (i.e. bare-minimum compatibility with TensorFlow and Keras float32 precision; even float16 or float8 precision is way out of reach, so we'd need to wait until we get a bigger quantum chip actually built to try ideas like this one at these sorts of scales with circuit-model quantum computers). So training and then interchanging layers in this way won't be viable."
Is it possible that an int8-trained layer could work on a present quantum computer?
I'm not entirely sure if you are still referring to doing the full training algorithm on a quantum computer or are now referring to something more like just inference. In either case the answer is still no. In the case of the full training algorithm it's totally out of reach for today's hardware; in the case of inference it is also still pretty out of reach, but maybe a little less so, because you don't have as much going on as when doing the training loop in a fully quantum fashion. Expanding on my earlier point: for starters, let's ignore all of the nuance and complexity in carrying out a physically realizable encoding of an int8 or float8 number onto eight qubits in the manner done here (https://arxiv.org/abs/1806.09729), and ignore all the nuance and complexity of carrying out addition and multiplication on a quantum computer (which already makes this virtually impossible on today's quantum computers).
Float8 and int8 both use eight bits to store a number, so following along with our paper you could store one number per eight qubits. Even just to store four numbers (without beginning to think about operating on them) would require 4 * 8 = 32 qubits, which is starting to approach the limit of most state-vector style simulators. Add one more number in there and you're up to 40 qubits, which puts things out of reach for most simulators and most actual quantum computers out there. So I guess if you REALLY wanted, you could implement a faithful int8/float8 encoding and training algorithm on a quantum computer that only used at most four or five eight-bit numbers (basically leaving you with room for the addition of two numbers in the forward pass and the storing of that addition's gradient in the backward pass). This seems like a lot of work to add two numbers, and don't forget we've already basically ignored the fact that gate depths, connectivity, and other current-day device limitations already prohibit us from getting anywhere close to doing this.
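Just to spell out that back-of-the-envelope arithmetic (my own illustrative numbers, counting only storage and ignoring any ancilla qubits):

# Qubits needed merely to *store* n eight-bit numbers, one qubit per bit.
BITS_PER_NUMBER = 8
for n in range(1, 6):
    print(f'{n} numbers -> {n * BITS_PER_NUMBER} qubits')
# 4 numbers -> 32 qubits: approaching the limit of state-vector simulation.
# 5 numbers -> 40 qubits: out of reach for most simulators and current devices.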
There are more ad-hoc approaches (none of which are completely quantum in nature; all have some form of classical feedback/assistance) that let you do inference and training that involve your quantum computer in certain small stages by "momentarily quantumifying" your numbers in the form of qubit rotations followed by some gates and then a measurement. These "momentarily quantumifying" encodings are not 1:1 with carrying out the actual int/float encodings and arithmetic operations on your quantum computer. Both the PennyLane example you linked and the snippet I wrote are examples of such ad-hoc approaches and are not "fully quantum" in nature at all. Even getting these kinds of algorithms to run reliably on today's quantum computers is a tall order (people write papers about it: https://arxiv.org/pdf/2012.04145.pdf).
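For anyone wondering what that "rotation in, measurement out" style of encoding looks like at the circuit level, here is a bare-bones sketch (my own illustration, not tied to either demo):

import cirq

def momentarily_quantum(x):
    """Encode a single float as a qubit rotation, then read back <Z>.

    This is the ad-hoc style of encoding discussed above: x never exists as an
    int8/float8 register on the device, it only sets a rotation angle.
    """
    qubit = cirq.GridQubit(0, 0)
    circuit = cirq.Circuit(cirq.X(qubit) ** x)  # rotate by x * pi about the X axis
    state = cirq.Simulator().simulate(circuit).final_state_vector
    return cirq.Z(qubit).expectation_from_state_vector(
        state, qubit_map={qubit: 0}).real

print(momentarily_quantum(0.0))  # ~ +1.0
print(momentarily_quantum(0.5))  # ~  0.0
print(momentarily_quantum(1.0))  # ~ -1.0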
Again, I would like to circle back and understand your earlier statement. Could you please elaborate here?
but I have had to go to PennyLaneAI to find good simple Keras layers examples that abstract away the Quantum Math when training reasonably easy Keras Layer Models
Both demos make use of the Keras API for classical training loops with "momentarily quantumifying" components inside of the tf.keras.Model instances they are training, making them thematically pretty much identical to one another in terms of training loop and algorithmic logic inside the model architecture itself (the only things different about them are the datasets and quantum circuit parameterizations used). The PennyLane tutorial isn't employing a fully quantum training algorithm and neither are we. So if you are looking for fully quantum versions of the entire training loop, neither of these examples will do that for you; but if you are looking for the same functionality as the PennyLane tutorial you linked, implemented in TFQ, then the snippet I gave does that. Is the concern here that TFQ doesn't have many snippets like the one I just gave or the ones found in the PennyLane tutorial, or is this also related to your first question?
I would like to figure out if this means we need a feature add somewhere, or more docs/explanations in our codebase, etc.
LOL. I really do appreciate the extensive explanation. We can close this issue.
I learn from good code examples, so more examples are always better, but as you have so clearly pointed out, today's quantum computers are a long way off from having enough qubits to do what I want to be able to do with them.
Once again @MichaelBroughton thank you so much for spending the time to answer my question.
I am a die-hard TensorFlow fan (I have spent years on both TensorflowJS and TensorflowMicro) but I have had to go to PennyLaneAI to find good simple Keras layers examples that abstract away the Quantum Math when training reasonably easy Keras Layer Models. Could TFQ provide a few more examples in this area? Preferably 1-3 input nodes and 1-3 output nodes.
The https://www.tensorflow.org/quantum/tutorials/mnist tutorial gets too caught up in the image manipulation to be very useful.