Open JavariaGhafoor opened 1 year ago
Hi, thank you for submitting your algorithm. However we have several questions on it.
If there is no response by 23:59 on Tuesday, August 8th (JST), the score may be changed at the discretion of the management, due to the possibility that the submission falls under the prohibited item "Hard-coded good parameter sets" described in README.md.
What is the reason for multiplying random coefficients such as np.random.uniform(0.9, 1) and adding them in find_final_cost(cost)? Also, when calculating a weighted average we need to divide by the sum of the weights, but why haven't you done that?
What is the reason for choosing 0.9 and 1.1 in find_final_cost(cost)? There is a possibility that these parameters are hard-coded to only work for this specific setting.
We would like you to clarify these points.
Hi,
Although I am busy with travel (stuck at Doha Airport due to recurrent delays of the same flight, now totalling over 44 hours), and I don't have my laptop's charger or my tablet, as they are in my checked luggage, I'm compiling this comment on my phone. I'll try my best to respond to your questions.
Below is an extensive explanation of the approach, followed by responses to the stated questions at the end.
Extensive explanation of the approach:
If you see the diagram in the presentation, I have made a well of which we are interested in finding the minima. "This type" of well is only attainable when the number of shots is approximately equal (in hundreds) to the minimum number of shots that does not return garbage values in each iteration of VQE, along with an ansatz in which the parametric rotation gates and CZ gates are applied only in the first half of the circuit.
By "this type" I mean a well with a larger spread of points on the right side of the minima (the side from which we are approaching the minima) and a smaller spread of points on the left side of the minima. And this is true for any number of qubits; this is the reason why for 4 qubits I have set the number of shots equal to 100, and for 8 qubits I have set the number of shots equal to 400 (i.e. the minimum number of shots, in hundreds, that does not return garbage values).
And for 4 qubits, we get a range of values from <5 to >2, and 5 - 2 = 3, and 40% of 3 equals 1.2, so we shall only consider the values from <5 to >3.8, since 5 - 3.8 = 1.2. But since we need to keep some leeway for values that may at times be >=5 by a small increment, and/or <=2 by a small decrement, we round 1.2 to the nearest quarter (i.e. num, num.25, num.5, num.75, num+1, where num is the integral part). So, in the code I have considered the values from <5 to >3.75, since 5 - 3.75 = 1.25 (1.2 rounds to 1.25).
Similarly, for 8 qubits, we get a range of values from <10 to >4, and 10 - 4 = 6, and 40% of 6 equals 2.4, so we shall only consider the values from <10 to >7.6, since 10 - 7.6 = 2.4. But since we need to keep some leeway for values that may at times be >=10 by a small increment, and/or <=4 by a small decrement, we round 2.4 to the nearest quarter. So, in the code I have considered the values from <10 to >7.5, since 10 - 7.5 = 2.5 (2.4 rounds to 2.5).
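To make the above concrete, here is a minimal sketch of the cutoff computation described in the two paragraphs above (illustrative only: the helper name is made up and this is not copied from answer.py):

```python
import numpy as np

def leftmost_40pct_cutoff(costs):
    """Illustrative sketch: magnitude above which values are kept."""
    # Work with magnitudes, ignoring the negative sign of the per-iteration costs.
    mags = np.abs(np.asarray(costs, dtype=float))
    hi = np.ceil(mags.max())          # e.g. 5 for 4 qubits, 10 for 8 qubits
    lo = np.floor(mags.min())         # e.g. 2 for 4 qubits, 4 for 8 qubits
    width = 0.4 * (hi - lo)           # 40% of the range: 1.2 and 2.4
    width = np.round(width * 4) / 4   # round to the nearest quarter: 1.25 and 2.5
    return hi - width                 # keep magnitudes above this: 3.75 and 7.5
```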
You may also note in the code that I have discarded values above 10, 100, 1000, etc. whenever the values span one of these boundaries (i.e. they range from 10.something down to below 10, from 100.something down to below 100, or from 1000.something down to below 1000, respectively). In these cases 10.something, 100.something, and 1000.something have an extra significant digit in their integral part compared to 9.something, 99.something, and 999.something, respectively. This extra significant digit would take us way off the actual minima if we also considered these outlier points, so we discard them.
Furthermore, note that when the length of the range of values is <=2, the algorithm simply takes the minimum value obtained as the minima, as it is anticipated that for such a short range the minimum value obtained must already be close to the actual minima and gives a good approximation.
Also, for the cases in which we have some outlier value(s) of 10.something, 100.something, 1000.something, etc. (i.e. the values span one of the boundaries above) and the difference between the integral part of 10.something and the nearest value below 10 (etc.) is >=2, it is anticipated that the minima is better approximated by the average of all the weighted values from the leftmost 40% range of values with the uppermost and lowermost below-10 (etc.) weighted values removed from the formula. For example, for values of the form 10.something, 8.something, 7.something, 6.something, 5.something, 4.something, we discard the terms with weights np.random.uniform(0.9, 1) and np.random.uniform(1, 1.1) from the formula, and change the formula to be just the average of all (immediate integer <10).something values.
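A rough sketch of the first two special cases above (discarding the extra-digit outliers and falling back to the plain minimum when the range is short); the third case with the removed weight terms is not shown, and the helper is illustrative rather than taken from answer.py:

```python
import numpy as np

def handle_special_cases(costs):
    """Illustrative sketch of the outlier-discarding and short-range rules."""
    costs = np.asarray(costs, dtype=float)
    mags = np.abs(costs)
    # (a) Discard outlier values whose integral part has an extra significant
    #     digit (e.g. 10.something values when the rest lie below 10).
    digits = np.floor(np.log10(mags)).astype(int)   # 0 for 1-9.x, 1 for 10-99.x, ...
    keep = digits == digits.min()
    costs, mags = costs[keep], mags[keep]
    # (b) If the remaining range of values is <= 2, simply take the minimum cost.
    if np.ceil(mags.max()) - np.floor(mags.min()) <= 2:
        return float(costs.min())
    return costs   # otherwise continue with the weighted-average formula
```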
The formula and random coefficients/weights:
Note that we are considering the leftmost 40% of the range of the set of values. Further note that, from the diagram in the presentation, we can see that with the minimum being the bottom-most point in the well, the leftmost 40% covers (40-x)% of the values left of the minimum and x% of the values right of the minimum. This x can take a slightly different value in each new run of the algorithm, but it is anticipated that x is close to approximately 20.
The formula set for the calculation of final cost (minima) is:
minima = np.average(
    [np.random.uniform(0.9 - j*0.1, 1 - j*0.1) * i for i in dic[f_(m-1-j)_max]]
    + ... +
    [np.random.uniform(0.9 - 2*0.1, 1 - 2*0.1) * i for i in dic[f_(m-3)_max]]
    + [np.random.uniform(0.9 - 1*0.1, 1 - 1*0.1) * i for i in dic[f_(m-2)_max]]
    + [np.random.uniform(0.9, 1) * i for i in dic[f_(m-1)_max]]
    + dic[f_m_max]
    + [np.random.uniform(0.9 + 1*0.1, 1 + 1*0.1) * i for i in dic[f_(m+1)_max]]
    + [np.random.uniform(0.9 + 2*0.1, 1 + 2*0.1) * i for i in dic[f_(m+2)_max]]
    + ... +
    [np.random.uniform(0.9 + k*0.1, 1 + k*0.1) * i for i in dic[f_(m+k)_max]
        if i < -float(str(int(f_(m+k)_max)) + ".0 or .25 or .5 or .75")]
)
where j = k for an odd total number of brackets of values and j = k + 1 for an even total number of brackets of values, m is the index of the median bracket of values, and f_n_max is the nth maximum floor value of the brackets of values.
For example, for 8 qubits, if we get values ranging from <10 to >4, we shall consider the brackets of values <10 to >9, <9 to >8, and <8 to >7.5, with f_1_max = 9, f_2_max = 8, and f_3_max = 7. And here j = k = 1.
Since x can take any value from a range of possible values around 20 for each new run of the algorithm, we randomly take 90-100% weighted values of the bracket of values immediately to the left of the median bracket of values of the leftmost 40% of the length of the range of all values, 80-90% weighted values of the bracket of values immediately to the left of this 90-100% weighted bracket of values, and so on until we reach the leftmost bracket of values. We also randomly take 100-110% weighted values of the bracket of values immediately to the right of the median bracket of values of the leftmost 40% of the length of the range of all values, 110-120% weighted values of the bracket of values immediately to the right of this 100-110% weighted bracket of values, and so on until we reach the rightmost bracket.
Note: By "bracket of values" I mean the values lying between two consecutive integers OR the values lying between the rightmost limit of the 40% of the length of range of all values and the integer immediately left to it.
Responses to the questions asked:
Ans) The reason for multiplying by random coefficients is that we can't multiply a bracket of values by a fixed weight/coefficient, as that would just scale all those values by that fixed weight and would not be generalizable. Also, by multiplying with random coefficients that differ by 10%, the bracket of values is weighted/scaled to now cover >=2 times the range it originally covered, giving more overlap of values with the true minima and resulting in a higher chance that the average of these weighted values lands pretty near the true minima.
Ans) I am not calculating a weighted average. I am scaling the numbers on the left side of the median bracket of values (of the leftmost 40% of the length of the range of all values) down by 90-100%, and the numbers on the right side of the median bracket of values (of the same leftmost 40%) up by 100-110%. Then I am adding these weighted values and dividing by the total number of values added (i.e. taking the average of these scaled/weighted values).
The concept here is to bring some of the values further to the left of the true minima (which most likely lies somewhere in the median bracket of values of the leftmost 40% of the length of the range of values) closer to the true minima by scaling them down randomly by 90-100%, and to bring some of the values further to the right of the true minima closer to it by scaling them up randomly by 100-110%.
Note: When I say scale up/down or bring up/down, I am ignoring the negative sign in the values/costs per iteration.
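To make the distinction concrete with made-up numbers:

```python
import numpy as np

values  = np.array([-9.3, -8.7, -8.1])     # made-up per-iteration costs
weights = np.array([0.95, 1.00, 1.05])     # made-up random coefficients

# What the code does: average of the weighted (scaled) values,
# i.e. divide by the number of values.
average_of_weighted_values = np.sum(weights * values) / len(values)

# What the question assumed: a weighted average, i.e. divide by the sum of the
# weights (same as np.average(values, weights=weights)).
weighted_average = np.sum(weights * values) / np.sum(weights)
```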
Ans) The reason I chose the weights to be 90-100% and 100-110% is that 90-100% rescales the values in the bracket of values immediately to the left of the median bracket so that they range from around the rightmost limit of the median bracket to the leftmost limit of the bracket itself; similarly, 100-110% rescales the values in the bracket of values immediately to the right of the median bracket so that they range from around the leftmost limit of the median bracket to the rightmost limit of the bracket itself.
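A toy illustration (with made-up values) of how these weights map the neighbouring brackets onto the median bracket for the 8-qubit example above (median bracket: magnitudes 8-9; left neighbour on the cost axis: magnitudes 9-10; right neighbour: magnitudes 7.5-8):

```python
import numpy as np

rng = np.random.default_rng()
left_bracket  = np.array([9.2, 9.8])   # made-up magnitudes from the (9, 10) bracket
right_bracket = np.array([7.6, 7.9])   # made-up magnitudes from the (7.5, 8) bracket

scaled_left  = left_bracket  * rng.uniform(0.9, 1.0, left_bracket.size)
scaled_right = right_bracket * rng.uniform(1.0, 1.1, right_bracket.size)

# Any magnitude in (9, 10) times a factor in (0.9, 1.0) lands in (8.1, 10):
# in cost terms, from the bracket's own leftmost limit (-10) up to around the
# median bracket's rightmost limit (-8).
# Any magnitude in (7.5, 8) times a factor in (1.0, 1.1) lands in (7.5, 8.8):
# in cost terms, reaching from the bracket itself into the median bracket.
```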
Ans) The presentation doesn't say that I am taking a weighted average; it says that I am taking the average of weighted values. I have written the code to cover all cases up to a length of range of values < 7. However, the code is extendable to lengths >= 7 (there is a typo in the comment in the code). And therefore, I have selected all of the values in the leftmost 40% of the length of the range of values to incorporate into the formula for calculating the final cost (minima), at least for the given 4-qubit and 8-qubit Hamiltonians in the code (as explained above with the generalized formula, before responding to the stated questions).
The reason for choosing this particular value of 40% is that for both 4 qubits and 8 qubits, I could see that the minima lay at a distance of around 20% of the length of the range of values from the leftmost value. So, since it's true for both 4 qubits and 8 qubits, it can be generalized to any number of qubits; hence the algorithm is scalable/extendable.
Declaration: The parameters set/selected are all based on logic, ranged, and randomized, and are not hardcoded to get a good answer.
Let me know if there are any further questions or points of confusion, so I can clarify them.
Best, Javaria
Team name:
Javaria
Team members:
Javaria Ghafoor
Project Description:
To calculate the ground state energy of the given Hamiltonian, the Hartree-Fock ansatz is first designed such that the parametric rotation gates and CZ gates are applied only in the first half of the circuit. The estimator is then set up for measuring the expectation value of the Hamiltonian. Shots are allocated for concurrent parameter sampling; these are set to approximately the minimum number of shots that does not result in garbage values. Then, the anti-noise VQE's optimization loop is implemented using the AdaBelief optimizer, and the final cost is calculated using a technique that mitigates the noise present in each iteration's output cost value.
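For reference, a generic numpy sketch of the AdaBelief update rule used in the optimization loop (this is the textbook update with illustrative default hyperparameters, not code copied from answer.py):

```python
import numpy as np

def adabelief_step(theta, grad, state, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One AdaBelief update of the ansatz parameters theta given a gradient estimate."""
    m, s, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad                    # EMA of the gradient
    s = b2 * s + (1 - b2) * (grad - m) ** 2 + eps   # EMA of the squared deviation ("belief")
    m_hat = m / (1 - b1 ** t)                       # bias corrections
    s_hat = s / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(s_hat) + eps)
    return theta, (m, s, t)

# Usage sketch: theta, state = adabelief_step(theta, grad, state) inside the VQE loop,
# where grad is estimated from the shot-based expectation values of the Hamiltonian.
```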
Presentation:
https://github.com/JavariaGhafoor/quantum-algorithm-grand-challenge/blob/main/Presentation.pdf
Source code:
https://github.com/JavariaGhafoor/quantum-algorithm-grand-challenge/blob/main/problem/answer.py
Your score (Optional)
Got a maximum average score over 3 runs of 52.91171961.