km-git-acc / dbn_upper_bound

Computational effort to upper bound the de Bruijn-Newman constant as part of a Polymath project

Finalising write-up of the Dirichlet bound numerics and the RH-conditional DBN-results #131

Open rudolph-git-acc opened 5 years ago

rudolph-git-acc commented 5 years ago

@teorth @km-git-acc

I am opening a new issue to help us coordinate and drive the efforts required to finalise the write-up. The OP of the 11th thread nicely summarises the topics still to be addressed:

[screenshot of the eleventh-thread OP listing the remaining write-up topics]

I) Is my understanding correct that the write-up for the Barrier numerics is pretty much complete now or is there more to be done?

II) The write-up for the numerics that need to show that the 'right side of the Barrier' is zero-free is still incomplete. KM started work on this over Christmas, however I am not sure how far he has progressed (slightly concerned since he seems no longer active on the blog and doesn't respond to e-mail; hopefully he reads this message and chips in again :) ). We do have an idea on how to complete this piece (also based on the polymath15 10th thread OP):

1) We need to ensure the range $N_a \leq N \leq N_b$ (where $N_a$ is the value of $N$ corresponding to the barrier $X$) is zero-free, by checking that for each $N$ the $ABB^{eff}$ lower bound always exceeds the upper bound of the error terms.
2) From theory, two lower bounds are available: the Lemma bound (eq. 80 in the write-up) and an approximate Triangle bound (eq. 79 in the write-up). Both bounds can be 'mollified' by choosing an increasing number of primes (up to a point) until the bound is sufficiently positive.
3) The Lemma bound is used to find the number of 'mollifiers' required to make the bound positive at $N_a$. For the Barrier location under study, using primes $2,3,5,7$ induced a positive Lemma bound of $0.067$ at the Barrier location $N_a=69098$.
4) The approximate Triangle bound (no mollifier) evaluates faster and has been used to establish an $N_b = 1700000$ beyond which the analytical lower bound takes over.
5) The Lemma bound is then also used to show that for each $N$ in $[N_a, N_b]$ the lower bound stays sufficiently above the error terms. The Lemma bound only needs to be verified on the line segment $y=y_0, t=t_0, N_a \leq N(x) \leq N_b$, since the Lemma bound monotonically increases as $y$ goes to $1$.
6) Explain that the upper bound on the error terms was chosen conservatively (something like $e_{C_0} + 3e_B$; need to check).
7) To speed up these verification computations, a fast "sawtooth" mechanism has been developed. This only calculates the minimally required incremental Lemma bound terms and only triggers a full recalculation when the incremental bound drops below a defined threshold:

7a) We wanted a threshold sufficiently above the error bounds, and $0.01$ provided a reasonable buffer.
7b) For every successive $N$, if one actually recomputed the bound with the non-sawtooth version, the bound would keep getting more positive, since the term $(t/2)\log N$ in the denominator keeps increasing (the number of summand terms also grows with $N$, but the denominator effect is much larger). The sawtooth version instead assumes that the summand terms for $N+1$ are the same as those for $N$ (thus forgoing the denominator advantage) and then subtracts the incremental terms (using $N+1$ in the denominator). Since we are essentially subtracting something at every successive $N$, the sawtooth bound keeps decreasing towards the threshold value; once it reaches it, we recalculate the bound with the non-sawtooth version (which gives a big jump back up) and the process starts again. A schematic sketch of this loop is included below.

(insert math descriptions from OP 10th thread of the sawtooth process here)
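To make 7a)/7b) concrete, here is a minimal schematic of the sawtooth control loop. It is only a sketch: `lemmabound_full` and `lemmabound_increment` are hypothetical placeholders for the full mollified Lemma-bound evaluation and for the incremental term subtracted when stepping from $N$ to $N+1$; they are not the actual routines in abbeff_largex_bounds.txt or LemmaSawtoothFinalv1.c.

```
\\ PARI/GP sketch of the sawtooth mechanism described in 7a)/7b).
\\ lemmabound_full(N, y, t) and lemmabound_increment(N, y, t) are hypothetical placeholders.
sawtooth_scan(Na, Nb, y, t, thresh) = {
    my(b = lemmabound_full(Na, y, t), fullcount = 1);
    if (b < thresh, error("full Lemma bound already below threshold at N = ", Na));
    for (N = Na + 1, Nb,
        \\ cheap step: keep the summands of the previous N, subtract only the new increment
        b -= lemmabound_increment(N, y, t);
        if (b < thresh,
            \\ expensive step: a full recalculation gives the 'big jump' back up
            b = lemmabound_full(N, y, t);
            fullcount++;
            if (b < thresh, error("Lemma bound below threshold at N = ", N));
        );
    );
    print("range [", Na, ", ", Nb, "] cleared with ", fullcount, " full recalculations");
}
```

With the parameters used above this would be invoked as something like sawtooth_scan(69098, 1700000, 0.2, 0.2, 0.01).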

Plot of Dirichlet and error bounds:

[plot: mollified Lemma lower bound (left) and upper bound of the error terms (right)]

Proposed text as 'caption' for the figure:

The lower bound per Lemma (x.y) in the left graph has been 'mollified' by the first 4 primes and was recalculated each time it dropped below $0.01$. It clearly exceeds the upper bound of the error terms shown in the right graph and demonstrates that $f_{0.2}(x+0.2i)$ does not vanish in this range of $N$. The error bound followed the recalculation pattern of the Lemma bound.

III) For the conditional runs down to DBN $< 0.1$, we propose something along the lines of:

1) In our quest to further reduce the DBN-constant below $0.22$, i.e. the domain where results will be conditional on RH-verification up to a certain height, we developed a new approach. For a carefully chosen combination of $y_0, t_0$, the Triangle bound (mollified with prime 2) can be made positive already at the location of the Barrier. This implies $N_a=N_b$, hence no further computations are required after the Barrier has been 'cleared' of any zeros having passed through it. So we have to find those $x, y_0, t_0$ combinations for which the Triangle bound just becomes (and stays) positive, and then place the Barrier at an optimal location just beyond the derived $x$-value. (insert curve for all suitable combinations, KM has such a graph)
2) Explain that the Barrier verification runs use exactly the same math and algorithms as were used for $y=t=0.2$ (i.e. with the stored sums and a 'Tloop').
3) Elaborate on the results achieved (all winding numbers zero, minmodABB never $< 1$, etc.). (maybe insert a plot of how the total number of rectangles to be processed increased for each run and/or how many mesh points needed to be evaluated at $t=0$ for each incremental run)
4) Explain how the accuracy of the results has been assured (e.g. we used 10 digits for $10^{20}$ and $10^{21}$, and 20 digits for the lower Barriers). We did quite a few checks on the mesh point output (mostly upfront when we tested the BOINC scripts, especially for $10^{20}$ and $10^{21}$).

Does this make sense? Any other data/information to provide?

rudolph-git-acc commented 5 years ago

@teorth @km-git-acc Started on III) and here is where I got to. I'll await feedback before including it with the correct LaTeX in the write-up.

In this section we collect some numerical results verifying the second two hypotheses of Theorem \ref{ubc-0} for larger values of $X$, and smaller values of $t_0$, than were considered in Section \ref{newup-sec}. This leads to improvements to the bound $\Lambda \leq 0.22$ conditional on the assumption that the Riemann Hypothesis can be numerically verified beyond the height $T \approx 1.2 \times 10^{11}$ used in Section \ref{newup-sec}.

Having the freedom to assume the RH has been verified up to a certain height allows the Barrier location $X$ to be the outcome of an optimised choice of $y_0, t_0$ at which the Triangle bound becomes positive. This implies no further computations are required after the Barrier has been cleared of any zeros having passed through it. To bring the potential Barrier locations within reach of computation, it proves fruitful to mollify the Triangle bound with the first prime as follows:

Include here the derivation of the moll-2 Triangle bound from: http://michaelnielsen.org/polymath1/index.php?title=Estimating_a_sum (Euler2 mollifier in toy model)

With this improved Triangle bound, more feasible Barrier locations and the associated $\Lambda$ can now be established. $\Lambda \log_{10} X$ should be roughly constant, so a small percentage increase in $\log_{10} X$ should lead to a corresponding percentage decrease in $\Lambda$. The following plot illustrates that indeed $X$ has to increase exponentially to achieve smaller and smaller improvements in $\Lambda$.
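As a heuristic aside (not taken from the write-up itself): if $\Lambda \log_{10} X$ stays roughly constant along the curve, then

$$\frac{\Delta \Lambda}{\Lambda} \approx -\frac{\Delta \log_{10} X}{\log_{10} X},$$

so halving $\Lambda$ requires roughly doubling $\log_{10} X$, i.e. squaring $X$; this is the exponential growth visible in the plot below.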

[figure: lambda_vs_x_barrierloc]

'caption' :

The curve reflects all potential Barrier locations and their associated $\Lambda$ for which the mollified Triangle bound becomes $\geq 0.03$. For this curve, $t_0$ was set equal to $y_0$. This may not always be the best choice, but it should not be too far from the optimal one.

A Barrier location can be chosen from the curve and then further optimised using an offset as explained in Section \ref{newup-sec}. The 'only' computational task remaining is to verify that no complex zeros have flown through the Barrier. For this task, exactly the same math and software scripts can be applied as were used for the proof of $\Lambda \leq 0.22$ (Section \ref{newup-sec}).

Our conclusions may be summarised as follows:

Insert here the table with the results (in LaTeX): [image of the results table]

The following plots illustrate that, for increasing $x$, both the number of xy-rectangles to be evaluated within the Barrier and the number of mesh points required per rectangle (measured at $t=0$) increase exponentially.

[figure: mesh_rectangles_at_barrierlocs]

'caption' :

The left graph shows how the number of mesh points of the xy-rectangle at $t=0$ increases with $x$ for each Barrier. The graph on the right does the same, but now for the total number of xy-rectangles that need to be evaluated per Barrier.

All Barrier runs generated a winding number of zero for each rectangle and the scripts completed successfully without any errors. For all Barrier locations the mesh points were calculated at 20 digits accuracy, except for the two highest locations where 10 digits were used (to be able to compute them within a reasonable time). Checks were made before each formal run to assure the target accuracy would be achieved.

The computations for $X=2 \times 10^{20} + 66447$ and $X=9 \times 10^{21} + 70686$ in the above table were massive, and were performed using a BOINC-based grid computing setup in which a few hundred volunteers participated. Their contributions can be tracked at http://anthgrid.com/dbnupperbound.

rudolph-git-acc commented 5 years ago

@teorth @km-git-acc Started to work on II) ("right side of the barrier") and wondered whether we could combine (ii) and (iii) described in this paragraph:

[screenshot of the write-up paragraph describing sub-ranges (ii) and (iii)]

I believe the line segment $N = 69098$ to $1.5 \times 10^6$ was originally split into two parts for computational reasons. However, when we started using the ARB software and had developed the faster "sawtooth" mechanism, the associated speed increase made this split range obsolete and we managed to easily process the entire run with a mollifier of 4 primes. Here is the link to the output (we even went up to $N = 1.7 \times 10^6$): https://github.com/km-git-acc/dbn_upper_bound/blob/master/output/eulerbounds/lemmasawtooth_bounds_t_0.20_y_0.2_N_69098_1.7mil_moll_2%2C3%2C5%2C7.txt

Would it therefore be ok to combine (ii) and (iii) in the write-up or should we leave it as is?

teorth commented 5 years ago

Rudolph, this all looks great, and thanks for taking the initiative on this. Certainly, if the numerics no longer proceed using the subdivision (ii), (iii), (iv) that was in an obsolete version of the computation, then we can rearrange it.

I have been busy with several other things in the last few weeks, but should have some time later this week to look at your revisions and make more suggestions.

rudolph-git-acc commented 5 years ago

Great, no hurry.

Since there is quite a bit of additional LaTeX involved, I decided to include the updates on II) and III) in a temporary debruijn.pdf to make them easier to review: https://drive.google.com/file/d/1JkYv3-lCUaMovPedRvZ5uSCJjqBp7TGx/view?usp=sharing

The proposed write-up on II) starts at the bottom of p49 and the section on III) starts at the top of p62.

km-git-acc commented 5 years ago

@rudolph-git-acc Nice. Thanks for taking the lead. I'm going through a somewhat strange phase - a hectic job and then compensating for it at home and on weekends - but I will try to chip in whenever I can.

rudolph-git-acc commented 5 years ago

@km-git-acc

Hey KM, thanks for the 'sign of life' and very happy to hear from you :) Please put your prime focus on getting your work/life balance back to normal again! I believe I can manage the write-up work, unless some very specific technical questions come up.

teorth commented 5 years ago

Hi, sorry for being absent for so long - there was a serious personal matter that consumed several months of my time. I have now returned to the task of finishing off this writeup.

There is a somewhat serious issue in the verification of what is now (ii) - the region where $N$ is between 69098 and $1.5 \times 10^6$ - arising from the fact that the Lemma bound is in fact not monotone in $y$ as previously thought (this is a subtle issue coming from the fact that the Euler-mollified coefficients $\tilde{\alpha}_n$ vary in a strange way with respect to $y$). Morally it should still be the case that the $y=0.2$ case dominates, however. I will try to fix this issue (it will probably need an appeal to the argument principle which, morally speaking, reduces matters to verifying the two sides $y=0.2$, $y=1$ of the region, and the $y=1$ region should be very easy) but it is going to be a bit complicated. In the meantime can you supply me with a link to the ARB code used to verify (ii)? I may eventually need to modify it for the fix I had in mind.

Will keep you posted.

rudolph-git-acc commented 5 years ago

Sorry to hear about the personal matter, but great to have you back :)

For updating the code, it is probably easiest to update the pari-gp script first: https://github.com/km-git-acc/dbn_upper_bound/blob/master/dbn_upper_bound/pari/abbeff_largex_bounds.txt. This script was replicated in ARB here (which I am happy to update once the new logic is ready): https://github.com/km-git-acc/dbn_upper_bound/blob/master/dbn_upper_bound/arb/LemmaSawtoothFinalv1.c

As an ultimate fallback for the issue in (ii), we could try to verify the RH a bit further than Platt's study. Around $T = 3 \times 10^{11}$ (see the table earlier in this thread) the Triangle bound becomes sufficiently positive for $t_0=y_0=0.2$ and we would avoid the need for (ii), like we did in the computations conditional on the RH. There actually is an ongoing piece of work to integrate the key RH verification logic into ARB (like Rosser's rule, Turing's method and the Odlyzko–Schönhage algorithm) here: https://github.com/fredrik-johansson/arb/issues/257 (with some involvement from Dave Platt).

teorth commented 5 years ago

OK, I think I have mostly fixed the issue. We have to show that $H_t(x+iy)$ does not vanish in a long rectangle to the right of the barrier. We modify this to $E_{t,7}(x+iy) H_t(x+iy) / B_t(x+iy)$, where $E_{t,7}$ is the Euler mollifier for primes up to 7. By the argument principle it will suffice to show that this expression stays away not only from zero, but from the negative real axis. It took a little bit of work, but the Lemma bound that gives a lower bound on the magnitude of this quantity (or more precisely on the Dirichlet series-like approximation to it) also gives a lower bound on the distance to the negative real axis. One does also have to treat the other three sides of the rectangle besides the most important lower side (in which $y=0.2$); I can handle two of them without difficulty, but one of them (the edge on the barrier in which $x \approx 6 \times 10^{10}$ and $y$ ranges between 0.2 and 1) may need some variant of the calculations already done in the barrier.
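For reference, a one-line sketch of why avoiding the negative real axis suffices, under the assumption (made here for the sketch only) that $E_{t,7}$ and $B_t$ are themselves zero-free in the rectangle $R$: writing $F = E_{t,7} H_t / B_t$, if $F(\partial R)$ avoids $(-\infty, 0]$ then $\arg F$ admits a continuous branch with values in $(-\pi, \pi)$ along $\partial R$, hence

$$\frac{1}{2\pi i} \oint_{\partial R} \frac{F'(s)}{F(s)}\, ds = \frac{1}{2\pi} \Delta_{\partial R} \arg F = 0,$$

and the argument principle then gives that $F$, and therefore $H_t$, has no zeros inside $R$.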

I have started to modify the writeup to reflect this new strategy. Right now I am trying to modify the pari-gp code to do the new verification. I think I found a way to proceed that is simpler than the sawtooth method. Namely, I can get a uniform lower bound for the distance of $E_{t,7}(x+iy) H_t(x+iy) / B_t(x+iy)$ to the negative real axis that works for all $N$ in a given range $[N_-, N_+]$, by using $N_-$ to lower bound various quantities that show up in the Lemma bound but $N_+$ (or more precisely $DN_+$) to determine the number of terms in the summation. My hope is then to cover the range $[69098, 1.5 \times 10^6]$ by a small number of intervals $[N_-, N_+]$ in which this bound gives what one wants; I should be able to do this by hand once I configure the PARI code a little bit to reflect some slight changes in the Lemma bound that I made to simplify the expressions somewhat. Will keep you posted.

teorth commented 5 years ago

I have a question. You indicated in your first post in this thread that you used the primes 2,3,5,7 to get a lower bound of 0.067. But it seems to me (after running abbeff_largex_ep_bound(69098,0.2,0.2,"L",5)) that just using 2,3,5 one can get a lower bound of 0.042, and this might be enough. Do you recall the reason why you went to 7 instead of 5? On my little laptop I have a certain amount of difficulty running the code at 7, but 5 is still manageable.

rudolph-git-acc commented 5 years ago

Good question. I just ran the Lemma bound at $t=y=0.2$, $N=69098$ in ARB and indeed get:

4 primes: 0.0670309236588400192
3 primes: 0.042779349872114395263
2 primes: -0.01038630322171839 

The reason for opting for 4 primes was that we wanted the bound to be large enough to: 1) be on the safe side (i.e. well above the upper bound of the error terms); 2) optimize the sawtooth mechanism (i.e. reduce the number of full recalculations required).

With your new approach, which potentially replaces the sawtooth mechanism, and also knowing that 0.04 is multiple orders of magnitude above the error terms, I believe we should be OK using 3 primes to cover this range.

P.S. I do recall 4 primes made pari/gp struggle a bit, but it was not an issue for ARB.

teorth commented 5 years ago

OK, I am close to finalising the verification to the right of the barrier (Section 8.5 in the newest version of the writeup). One has to show that E_{t,5}(x+iy) H_t(x+iy) / B_t(x+iy) avoids the negative real axis for a certain long rectangle R = { X_0 - 0.5 < x; 0.2 <= y <= 1; N \leq 1.5 x 10^6 } with t = 0.2. I can do this for three of the sides of the rectangle, the one missing thing is the left side x = X_0 - 0.5, 0.2 <= y <= 1 with t=0.2. This however should follow from the analogue of Figure 16, with t=0.2 instead of t=0. Would it be possible to draw this figure? One just needs f_t(x+iy) to have magnitude well away from zero (e.g. larger than 0.1 will be fine) and stay well away from the negative axis (e.g. having an argument of at most 2.58 radians in magnitude would be fine). If the t=0.2 picture is anything like the t=0 picture of Figure 16 then this should be true with lots of room to spare.

p.s. I adapted the PARI file (the new file is called new_abeff_largexbounds.txt in the pari directory) to verify that $E_{t,5}(x+iy) H_t(x+iy) / B_t(x+iy)$ avoids the negative real axis in a range of the form $t = 0.2$, $y = 0.2$, $N_1 \leq N \leq N_2$, as long as the output of abbeff_largex_ep_bound2(mtype,N1,N2,y,t) exceeds a threshold (0.00205, as it turns out, for this choice of parameters). For instance abbeff_largex_ep_bound2(5,69098,80000,0.2,0.2) = 0.026... so this clears the region $69098 \leq N \leq 80000$. I was able to cover the region $69098 \leq N \leq 1.5 \times 10^6$ by the five intervals $[69098, 8 \times 10^4]$, $[8 \times 10^4, 1 \times 10^5]$, $[1 \times 10^5, 2 \times 10^5]$, $[2 \times 10^5, 1 \times 10^6]$, $[1 \times 10^6, 1.5 \times 10^6]$; it may be possible to do it with fewer (there is a lot of room to spare) but I wasn't being very efficient.
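Not the actual verification script, but here is a minimal GP driver sketch of this covering step, assuming new_abeff_largexbounds.txt has been read in (e.g. via \r) so that abbeff_largex_ep_bound2(mtype, N1, N2, y, t) is available with the signature used above; the interval list and the 0.00205 threshold are the ones quoted in this comment:

```
\\ Sketch only: loop over the covering intervals and check each bound against the threshold.
check_right_of_barrier() = {
    my(thresh = 0.00205, y = 0.2, t = 0.2, mtype = 5);
    my(ivals = [[69098, 80000], [80000, 100000], [100000, 200000], [200000, 1000000], [1000000, 1500000]]);
    for (i = 1, #ivals,
        my(b = abbeff_largex_ep_bound2(mtype, ivals[i][1], ivals[i][2], y, t));
        print("[", ivals[i][1], ", ", ivals[i][2], "] : ", b);
        if (b <= thresh, error("interval not cleared; shrink [N1, N2] and retry"));
    );
    print("all intervals cleared at threshold ", thresh);
}
```

A run would then just be check_right_of_barrier(), printing the bound obtained on each interval.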

The output here is pretty close to the output from the old abbeff_largex_ep_bound data, so I would imagine that all the conditional results you've obtained would also go through. Unfortunately one now also has to do some verifications on the other three sides of the rectangle, but they should be doable with a lot of room to spare (and maybe we don't need to do them if we just want to give a tentative indication of what conditional results ought to be possible with our approach).

rudolph-git-acc commented 5 years ago

Great. Here is the plot for $t=0.2$ analogous to Figure 16. The number of mesh points required for a rectangle at this height of $t$ is only 52.

[figure: Polygonplott02 - the $t=0.2$ analogue of Figure 16]

Data used: Polygonplotmeshpointst02.txt
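Just as a cross-check idea (not something that was run as part of the project): with the 52 mesh-point values of $f_{0.2}$ loaded from Polygonplotmeshpointst02.txt into a GP vector, the two conditions Terry asks for (modulus larger than $0.1$ and argument at most $2.58$ radians in magnitude) could be verified with a few lines; the file parsing is left out since its exact format is not reproduced here.

```
\\ Sketch: vals is assumed to be a vector of the complex mesh-point values f_{0.2}(x+iy).
check_polygon(vals, minmod = 0.1, maxarg = 2.58) = {
    for (i = 1, #vals,
        if (abs(vals[i]) <= minmod || abs(arg(vals[i])) >= maxarg,
            error("condition violated at mesh point ", i));
    );
    print("all ", #vals, " mesh points satisfy |f| > ", minmod, " and |arg f| < ", maxarg);
}
```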

The new approach looks like a very nice simplification. As a next step I will try to replicate it in ARB.

P.S. In the comments of the 11th thread, there are two small updates to the write-up proposed that we shouldn't forget to include: https://terrytao.wordpress.com/2018/12/28/polymath-15-eleventh-thread-writing-up-the-results-and-exploring-negative-t/#comment-510681

teorth commented 5 years ago

Thanks for this! Interesting that some of the edges have very wide spacing but I guess this comes from the derivative bound getting considerably better when t and y are large. I've added this to the writeup (and the preceding changes have also been implemented).

Actually we now look very close to being able to finalise the writeup. I have just one question about the final table. The final column refers to the "Moll2 bound"; does this mean the Euler mollifier just using the prime 2 and nothing else, combined with what we called the "Lemma bound" (which has now become Lemma 8.5 in the latest writeup)?

rudolph-git-acc commented 5 years ago

The "Moll2 bound" refers to the (fast) Triangle bound and not to the Lemma bound. Last year you posted the derivation of a prime 2 Triangle bound on the Wiki page, that enabled us to considerably lower the heights of the range end points (N_1). And this then led to the idea that for certain combinations of t_0, y_0 and a free choice of x (assuming RH up to that point), we could select Barrier locations where N_1 was already sufficiently positive and no further verification work beyond the Barrier would be required.

P.S. The new Lemma lower bounds are now also working in ARB. I couldn't resist trying to optimise the intervals, and as you predicted it could indeed be done in fewer steps:

 [69,098 - 80,000]     = 0.0263417
 [80,000 - 110,000]    = 0.0470363
 [110,000 - 220,000]   = 0.0930734
 [220,000 - 1,500,000] = 0.0604905

rudolph-git-acc commented 5 years ago

The write-up starts to really look good :)

Some more info on the RH-conditional computations: in the second and fifth posts of this thread, I proposed some draft language to include in the final section of the write-up (starting after "...violations of the Riemann hypothesis at height T."). This text has also been transcribed into a temporary LaTeX version, LatexConditionalResults.txt, that includes a copy of the derivation of the prime 2 Triangle bound (from the Wiki http://michaelnielsen.org/polymath1/index.php?title=Estimating_a_sum), the results table and the two new graphs below. This still needs to be reviewed.

[figures: Lambda_vs_x_barrierloc and mesh_rectangles_at_barrierlocs]

Just to be sure: is it actually correct to assume that an analytical proof of non-vanishing will always exist beyond the point at which a prime 2 (i.e. mollified) Triangle bound has become sufficiently positive?

teorth commented 5 years ago

Thanks for this! I have now merged in the figures and the text accordingly, and I think the writeup is now nearing a final form.

It is technically true that, in order to properly clear out the region to the right of the barrier, it's not quite enough to do the calculation at N_0; one should start clearing out intervals as we now do in Section 8.5, and then do an analytic calculation past some extremely large cutoff N_1. But given how the bounds always get easier as N increases, and the conservative safety margin we have in the lower bound for f_t, this should not be a problem and I am not really concerned about verifying this carefully (though it would not be too difficult to do if, say, a referee insists on it).

rudolph-git-acc commented 5 years ago

Great! I went through the write-up one more time and spotted these typos:

1) p10, middle of page: "the zeroes of Ht behaves" -> "the zeroes of Ht behave"
2) p21, formula below "simplifying, we conclude that" contains the times symbol '×' twice
3) p48, top of page: "x = 6 × 10^10 + 83925" -> "x = 6 × 10^10 + 83952"
4) p64, middle of page: "beyond the height T ≈ 1.2 × 10^11 used in Section 8" -> "beyond the height T ≈ 3.06 × 10^10 used in Section 8"

teorth commented 5 years ago

Thanks! I've now implemented these corrections.

rudolph-git-acc commented 5 years ago

In the spirit of wrapping up the Polymath15 project, I remembered there was an outstanding action in the "Sharkfin" document (https://github.com/km-git-acc/dbn_upper_bound/tree/master/Writeup/Sharkfin) to describe the numerical experiments we had done in the negative t domain.

I have now proposed a short description in the document (on p2/3). I would be grateful if you could have a quick look at whether it meets your needs.

P.S. I recall that one of your grads was interested in trying to make the heuristic approximations about the negative t domain more rigorous. Just out of interest, did that work indeed kick off?

teorth commented 5 years ago

Thanks for this! Looks great, and I have added a ref to it from the main writeup.

My student was able to justify the curves of complex zeroes for large negative values of t (precisely how negative depends on which curve one wants to justify), but something funny happens for small negative values of t in which the zeroes still peel off but are distributed in a much less ordered way than along a discrete family of curves. There seems to be some interesting mathematics going on here that is not yet fully explained.

km-git-acc commented 3 years ago

Strange to think it's been 2 years... time flies by too quickly.

Stating briefly, I finally started my own business in 2019. Turns out to be much tougher than one hopes for. But happy to say it can ride tricycles now, and hopefully soon bicycles too.

I have resisted 'math temptations' during this time, but since the last month or two, have found myself yielding to them again. Hopefully, there are more such projects and opportunities to contribute in the future.