Closed by mickahell 3 years ago
Starting from |0>, we then apply the gates and parameters given by the external file.
Example:
```
|--------------------|
|       Device       |
|---------|--|-------|
| RX(π/3) |--| RZ(0) |----- |S(x)>
|---------|--|-------|
|--------------------|
```
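As a minimal sketch of the idea (the function names and the hard-coded gate list are illustrative, not part of the project code), the device above can be simulated on a single-qubit state vector, where the gate/parameter list stands in for what would be read from the external file:

```python
import numpy as np

def rx(theta):
    # Rotation about the X axis of the Bloch sphere
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def rz(phi):
    # Rotation about the Z axis of the Bloch sphere
    return np.array([[np.exp(-1j * phi / 2), 0], [0, np.exp(1j * phi / 2)]])

def run_device(params):
    # Start in |0>, then apply the gates with the parameters
    # that would be given by the external file.
    state = np.array([1.0 + 0j, 0.0])
    for gate, angle in params:
        state = gate(angle) @ state
    return state

# RX(pi/3) followed by RZ(0), as in the diagram
out = run_device([(rx, np.pi / 3), (rz, 0.0)])
```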
Idea from @tvarga78 :
Device types
------------

1) Unsupervised learning: The device outputs quantum states of unknown labels; our task is clustering. We'll see what happens using a simulator.

2) Supervised learning: We have a labeled training set of device output states. Our task is to train a classifier which learns the unknown classification rule. Variants:

2a) During operation, any given device output can be reproduced as many times as we wish, because we can reproduce the physical circumstances (parameters) that caused the device to produce the given output. This would enable us to apply data re-uploading strategies during classification.

2a-alternative) Let's have M identical devices, i.e. let the classifier NOT have access to the parameters of the device circuit (as we cannot just clone the output of a single device M times). Then a physical event makes them all produce the very same output, so we can structure our classifier circuit to make use of M identical device outputs.

2b) During operation, the device outputs cannot be reproduced, because for some reason it's not feasible to reproduce the physical circumstances (parameters) mentioned above. So we cannot apply data re-uploading strategies. However, during the training session we are in full control of the device's parameters in our lab.
So basically, the point in 2a and 2b is whether we can apply data re-uploading or not.
I like the data re-uploading idea a lot: https://arxiv.org/abs/1907.02085
In fact, during the training part you know the data you sent as input, so you can produce the same qubit anytime you want.
Yes, but in 2b, during operation, i.e. during testing, we don't have that knowledge: the device is taken out of our nice lab. So when programming the classifier, we cannot take it for granted that we can reproduce the device's output as we wish.
Here is the summary of what we discussed. We target 2 tasks: one is higher priority, the other is lower. The idea is that we make progress with the higher priority task until Wed.
1) High-priority task: 2b
Idea: we classify regions of the Hilbert space. E.g. the Northern hemisphere of the Bloch sphere is "cat", the Southern hemisphere is "dog". As a first step, we create a parameterized device that can create state vectors that fall either into the "cat" or the "dog" region. Then, we feed the output state of the device into a parameterized classifier circuit (see the ansatz in the VQE 100 exercise for one way to build such a circuit). Then, we optimize the classifier the same way as in the Circuit Training 500 exercise. (We emphasize that we are NOT classifying the classical params vector of the device... we could use any other device with a different parameterization as long as it's capable of producing "dog" and "cat" states.)
Catch: during operation, we cannot calculate expected values, because the device can run only once. So when we calculate the accuracy on the test set, we are not allowed to calculate expected values... we can run the circuit just once, make the PauliZ measurement, and if the result is "-1" we say "dog", if it's 1 we say "cat". (On the other hand, in the training phase we can optimize the circuit parameters, the thetas, using expected values of the measurements, as training is done in our laboratory. But operation may happen e.g. in a self-driving car.)
Compare with: the standard way of testing, that is, when we can run the circuit multiple times to get expected value of the PauliZ measurement. The result can be e.g. 0.5, then we classify it as "cat", as it's closer to 1 than to -1. (So, you see that when we can run the circuit only once, we can be unlucky that we get -1, even if the expected value over many runs is 0.5.)
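The gap between 1-shot and multi-shot testing can be seen in a small numerical sketch (a toy model, not the project code: we just draw ±1 PauliZ outcomes for a state whose true expected value is 0.5):

```python
import numpy as np

rng = np.random.default_rng(0)

def single_shot_label(exp_z, rng):
    # One PauliZ measurement: outcome +1 ("cat") with probability
    # (1 + <Z>) / 2, and -1 ("dog") otherwise.
    p_plus = (1 + exp_z) / 2
    return 1 if rng.random() < p_plus else -1

exp_z = 0.5  # true expected value of the PauliZ measurement

# Standard testing: many shots, classify by the sign of the estimated <Z>
shots = [single_shot_label(exp_z, rng) for _ in range(1000)]
multi_shot_class = 1 if np.mean(shots) > 0 else -1  # "cat"

# 1-shot testing: how often does a single shot disagree with the
# multi-shot decision? For <Z> = 0.5 this happens on about 25% of runs.
errors = np.mean([single_shot_label(exp_z, rng) != multi_shot_class
                  for _ in range(10_000)])
```

This is exactly the "unlucky -1" effect described above: the closer the expected value is to 0, the more often a single shot misclassifies.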
Result to get: (by Wed) how much does the accuracy drop due to the fact that we can run the circuit only once during testing? (Also, can we use a smart cost function during training that will make the accuracy drop less?)
2) Low-priority task: 2a_alt
Idea: mingling the device with the classifier, that is, running the device again and again in every layer will make our classifier more powerful. See paper here.
Result to get: (by Fri) compare with "traditional" approach of having the device only at the beginning of the circuit. How much does the accuracy increase? (As opposed to the high-prio task, here we can use expected values during BOTH training and testing.)
Story: there are M identical devices, located very close to each other. Each device has exponentially many parameters, e.g. because it has exponentially many layers of gates (see ansatz in VQE 100 exercise). When a physical event happens (e.g. a huge gravitational wave goes through the devices), it will set all the exponentially many parameters of the M devices at once, identically, as the devices are close to each other. Then, the parameters are fixed by the event and don't change until the next event (each device is a black box for us, we cannot see the parameters, and there are too many of them anyway). So we can assemble our circuit with M layers as you see in the picture, and run it as many times as we wish (assuming the next event won't happen for some time).
Why not use a general rotation gate U3(theta, phi, lambda) for the state preparation? I would suggest restricting the rotation angles to a certain region of the Bloch sphere and drawing them at random. We assume here that we deal with pure states; in real life, we would have to consider mixed states/density matrices.
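A minimal sketch of this suggestion (the hemisphere split and the function names are illustrative assumptions): draw (theta, phi) at random with theta restricted to one half of the Bloch sphere per class, and prepare U3(theta, phi, ·)|0> as a pure state.

```python
import numpy as np

rng = np.random.default_rng(42)

def u3_state(theta, phi):
    # Pure single-qubit state U3(theta, phi, lam)|0>
    # = cos(theta/2)|0> + e^{i*phi} sin(theta/2)|1>
    # (lam does not affect the output when the input is |0>).
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)])

def sample_angles(n, hemisphere):
    # Restrict theta to one hemisphere of the Bloch sphere:
    # "north" -> theta in [0, pi/2), "south" -> theta in [pi/2, pi)
    lo, hi = (0, np.pi / 2) if hemisphere == "north" else (np.pi / 2, np.pi)
    theta = rng.uniform(lo, hi, n)
    phi = rng.uniform(0, 2 * np.pi, n)
    return theta, phi

# 100 random states per class, as in the picture below
qcat_theta, qcat_phi = sample_angles(100, "north")
qdog_theta, qdog_phi = sample_angles(100, "south")
```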
Of course @Zed-Is-Dead, the schema I wrote is just a suggestion, and using a U3 is absolutely fine too.
Example of states (100 QCats in red and 100 QDogs in blue) generated at random with some restrictions on the angles:
@Zed-Is-Dead this is perfect for graphing at the end!
@Zed-Is-Dead I'll update the readme with this image
I have pushed a notebook that generates random qcats and qdogs (cf. picture above), along with 2 csv files containing the cats/dogs angle parameters. Instructions are given at the end on how to recreate the qcats/qdogs from the respective angles.
I just used a function in an Excel file to generate them all at the same time ^^'
I've updated with training/testing data (100 qcats + 100 qdogs for training and 25/25 for testing).
I pushed a simple U2 sensor, taking as params a simple array [x, z].
EDIT: it's now a template circuit that can take an array of 2 or 3 angles (it doesn't matter whether the data are 2 or 3 angles).
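A sketch of why accepting 2 or 3 angles can behave identically (the `sensor` name and the defaulting of the third angle are assumptions for illustration): when the circuit acts on |0>, the third U3 angle lambda only multiplies the second column of the gate matrix, so it drops out of the output state.

```python
import numpy as np

def u3(theta, phi, lam):
    # Full U3 gate matrix
    return np.array([
        [np.cos(theta / 2), -np.exp(1j * lam) * np.sin(theta / 2)],
        [np.exp(1j * phi) * np.sin(theta / 2),
         np.exp(1j * (phi + lam)) * np.cos(theta / 2)],
    ])

def sensor(angles):
    # Accepts [theta, phi] or [theta, phi, lam]; acting on |0>,
    # lam has no effect on the resulting state.
    theta, phi = angles[0], angles[1]
    lam = angles[2] if len(angles) > 2 else 0.0
    return u3(theta, phi, lam) @ np.array([1.0 + 0j, 0.0])

s2 = sensor([np.pi / 3, 0.2])        # 2 angles
s3 = sensor([np.pi / 3, 0.2, 1.5])   # 3 angles, same output state
```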
Idea: we classify regions of the Hilbert space, of quantum states of n qubits. There are 2 categories, "Qat" and "DoQ". As an example, for n=1, one hemisphere of the Bloch sphere could be labelled "Qat", the other hemisphere "DoQ". The state vectors to classify are generated as the output of a sensor, which is then fed into a classifier circuit of M layers. Note that we are NOT classifying the classical params vector of the sensor, as we could use any other sensor with different parameterization as long as it's capable of producing Qat and DoQ states. Also, we take the sensor as is, we don't try to "optimize" it.
Catch: during operation, the sensor can only produce its output once. Thus, when we calculate the accuracy on the test set, we are not allowed to make use of expected values resulting from many shots. There is only 1 shot (in the training phase, we can optimize using expected values, as training is done in our laboratory where we can recreate the sensor outputs of the training set at will). We'd like to experiment how much the accuracy drops due to this 1-shot limitation, whether it's different using simulator vs real quantum hardware, and what kind of cost function would reduce this impact.
Extra: if multiple shots are allowed, how much would a data re-uploading scheme improve the accuracy? E.g. imagine there are M identical sensors located very close to each other. When a certain physical event happens, it sets all the parameters of the M sensors at once, identically for each sensor. Then, the parameters don't change until the next event. Furthermore, there may be exponentially many parameters of the sensor, inaccessible to us. So again, we are classifying quantum states.
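The M-identical-sensors scheme can be sketched as follows (a toy single-qubit model under assumed names: the sensor's fixed parameters `alpha` stand in for the event-set black-box parameters, and each layer re-applies the sensor followed by one trainable rotation):

```python
import numpy as np

def rx(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def rz(t):
    return np.array([[np.exp(-1j * t / 2), 0], [0, np.exp(1j * t / 2)]])

def sensor_unitary(alpha):
    # Black-box sensor; alpha was set by the physical event and is
    # fixed (and inaccessible) until the next event.
    return rz(alpha[1]) @ rx(alpha[0])

def classifier(alpha, thetas):
    # Data re-uploading: M layers, each one re-applying an (identical)
    # sensor output followed by a trainable rotation RX(theta_m).
    state = np.array([1.0 + 0j, 0.0])
    for theta in thetas:
        state = rx(theta) @ sensor_unitary(alpha) @ state
    return state

def expect_z(state):
    # Expected value of a PauliZ measurement on the final state
    return float(abs(state[0])**2 - abs(state[1])**2)

out = classifier([np.pi / 4, 0.3], thetas=[0.1, -0.2, 0.5])  # M = 3 layers
```

Training would then optimize the thetas so that expect_z separates Qat from DoQ states, exactly as in the single-layer case but with the sensor interleaved M times.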
Preparing data
`_device.py`: includes device circuit functions, e.g. `blochhemisphere(theta, phi)`, `classicalstate(bitarray)`, etc.