LMFDB / lmfdb

L-Functions and Modular Forms Database

Fix errors in classical modular form data and verify its correctness #1248

Closed AndrewVSutherland closed 5 years ago

AndrewVSutherland commented 8 years ago

In the thread for critical issue https://github.com/LMFDB/lmfdb/issues/1223 (errors in modular form data), it was suggested that we should hide the spaces with non-trivial character. This has not yet been done, and there are still many spaces with non-trivial character that have some but not all Hecke orbits missing. For example:

http://lmfdb.warwick.ac.uk/ModularForm/GL2/Q/holomorphic/11/11/10/
http://lmfdb.warwick.ac.uk/ModularForm/GL2/Q/holomorphic/11/13/10/
http://lmfdb.warwick.ac.uk/ModularForm/GL2/Q/holomorphic/3/97/2/

and many more. (I will shortly post an updated version of emf_db_stats.py that finds all of these; it is now in merged pull request https://github.com/LMFDB/lmfdb/pull/1249.)

We need to decide (within the next 12 hours!) whether to hide these spaces or not.

sehlen commented 8 years ago

Thanks, John!

Memory does not seem to be an issue right now. I’m still writing the file; I guess it’s going to be several gigabytes, and I hope that the write succeeds. I’m on a machine where I have 94GB RAM and 16 Intel(R) Xeon(R) CPU X5570 @ 2.93GHz cores (those might be 8 in fact, but it doesn’t matter), which should not be much worse than what we have on lehner (more cores on lehner, but they don’t help with a single computation). Anyway, it just takes time ;-)

Once we have a good interface to Magma for our webmodforms, I might use it or the 100+ cores at MIT that Drew advertised (I never had more CPUs at my disposal ;-) but that’s not going to happen this week.

I’ll take Bober’s degree 800 example (and some simpler cases so that I don’t have to wait a week for reads and writes) as a test case for how I have to adjust the classes to make sure we can use the data from Magma.

On May 18, 2016, at 14:55, John Cremona notifications@github.com wrote:

Stephan, feel free to use atkin or lehner which have magma and 64g each.

On 18 May 2016 at 18:04, Stephan Ehlen notifications@github.com wrote:

@jwbober https://github.com/jwbober Actually, Magma is better than I thought. Using not too much RAM (top shows 5.6g), it has been able to compute this space, and it seems to have taken around 1.5 days (I didn't time it and there are so many commands I don't know in Magma ;-). Anyway, I'm saving it to a file now and will get back to you once I know the minimal polynomial of a(2). I already printed it, but on the stupid machine that I'm on I only have screen, and my buffer is set too small to scroll up to the beginning ;-)

But this shows that interfacing Magma for the lmfdb will be very helpful (I started Sage around the same time; it is still running, but the machine is also slower, I think. It uses about 10g virtual memory, but the resident memory is actually not too bad at around 4.6g). I'll let you know if it terminates.

AndrewVSutherland commented 8 years ago

Regarding the MIT machines, I have a 48 core machine with 256GB RAM and a 64 core machine with 1.1TB RAM; I'm running some of the polredabs computations on them now, but just let me know when you are ready to use them and they will be at your disposal (they have Sage 7.1 and Magma V2.21-11 on them).

AndrewVSutherland commented 8 years ago

The page http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/83/12/1/?group=0 claims there are no newforms in this space, but according to http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/ranges/?level=80-100&weight=2-12&group=0 and Magma:

Dimension(NewSubspace(CuspidalSubspace(ModularForms(83,12))));

it has dimension 74.
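
The value 74 can also be cross-checked without Magma. What follows is a minimal sketch, assuming the standard dimension formula for S_k(Gamma0(N)) with even k and the usual convolution formula for the new subspace; the function names are made up for this illustration and are not part of the lmfdb code base.

```python
from fractions import Fraction
from math import gcd


def factor(n):
    """Prime factorization of n as a dict {prime: exponent} (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def euler_phi(n):
    result = n
    for p in factor(n):
        result -= result // p
    return result


def dim_cusp_gamma0(N, k):
    """Dimension of S_k(Gamma0(N)), trivial character, weight k >= 2."""
    if k < 2:
        raise ValueError("this sketch only handles weight k >= 2")
    if k % 2 == 1:
        return 0  # -1 lies in Gamma0(N), so odd weight forces the zero space
    primes = factor(N)
    mu = N  # index of Gamma0(N) in SL2(Z)
    for p in primes:
        mu = mu * (p + 1) // p
    eps2 = 0 if N % 4 == 0 else 1  # elliptic points of order 2
    eps3 = 0 if N % 9 == 0 else 1  # elliptic points of order 3
    for p in primes:
        eps2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
        eps3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    eps_inf = sum(euler_phi(gcd(d, N // d)) for d in divisors(N))  # cusps
    genus = (1 + Fraction(mu, 12) - Fraction(eps2, 4)
             - Fraction(eps3, 3) - Fraction(eps_inf, 2))
    dim = ((k - 1) * (genus - 1) + (k // 2 - 1) * eps_inf
           + (k // 4) * eps2 + (k // 3) * eps3)
    if k == 2:
        dim += 1
    return int(dim)


def beta(n):
    """Multiplicative weight: beta(p) = -2, beta(p^2) = 1, beta(p^e) = 0 for e >= 3."""
    result = 1
    for e in factor(n).values():
        if e == 1:
            result *= -2
        elif e > 2:
            return 0
    return result


def dim_new_gamma0(N, k):
    """Dimension of the new subspace of S_k(Gamma0(N))."""
    return sum(beta(N // d) * dim_cusp_gamma0(d, k) for d in divisors(N))


print(dim_new_gamma0(83, 12))
```

For level 83 and weight 12 the full cuspidal space has dimension 76, and removing the two copies of the one-dimensional level 1 space leaves 74, in agreement with the Magma command above.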

jwbober commented 8 years ago

I've now computed the Galois orbit structure and coefficient fields for all levels < 100 and weight < 13, and all characters. (And I also have scattered data for 100 <= level < 371).

This took longer than it should have because I've been doing things in a rather ad hoc manner. (These computations are still in an "experimental" phase, and I actually never even planned on fully identifying so many spaces.)

See http://www.maths.bris.ac.uk/~jb12407/newform-fields/ORBIT_STRUCTURE for a succinct description of the orbit structure. For an extreme example, see http://www.maths.bris.ac.uk/~jb12407/newform-fields/newformfields.83.12, where for the nontrivial character the minimal polynomial of a(2) is degree 3040 and has 8239301 digits.

(Disclaimer/warning: I've barely looked at this data to make sure that it isn't obviously full of errors.)

JohnCremona commented 8 years ago

There is a comment I wanted to make on this or one of the similar threads (where people were trying to polredabs the polynomials defining the Hecke fields):

We are computing more of these Hecke fields than anyone ever has before (surely?) so we could try to see the prevalence of huge discriminants as a feature rather than a bug, i.e. it is an interesting mathematical observation worthy of being considered data in itself that these fields are what they are. Any number field which is the Hecke field of a classical modular form of level <100 and weight <20 (or whatever) can be regarded as one which "arises naturally in nature".

Just a thought.

AndrewVSutherland commented 8 years ago

@JohnCremona I agree 100% (also see https://github.com/LMFDB/lmfdb/issues/1386), and I think it could be very useful (once we have dealt with more urgent issues) to try to add any invariants of the coefficient fields that can be feasibly computed (e.g. ramification data at small primes, low degree subfields), as this information could be of interest completely independent of the q-expansions, and I can imagine people wanting to search on these. This strikes me as an area where there might be new conjectures to be made and/or theorems to be proved based on data in the LMFDB.

This raises two more general questions that are not specific to modular forms:

  1. Should there be a general mechanism for storing "arithmetically interesting" number fields related to objects in the LMFDB that are outside (maybe far outside) the range of the data in numberfields.fields?
  2. What information can be feasibly extracted about a number field given only a (potentially very ugly) defining polynomial whose discriminant might be so large we cannot even factor it?

Perhaps @jwj61 has an opinion about these questions. It might be worth opening a new thread for this discussion to avoid hijacking this issue (which is still tagged as critical and quite urgent!).

AndrewVSutherland commented 8 years ago

@jwbober Thank you for the data! As soon as I wrap up the PR I am currently working on, I will take a look. I have data computed in Magma (for a much smaller range, obviously) that I plan to compare to the data in modularforms2. One other question: would it be possible (at least within the ranges (k,N) in [2,12] x [1,50] and [2,20] x [1,16] currently listed in http://beta.lmfdb.org/knowledge/show/mf.elliptic.extent, but further if feasible) to get you to compute a list of the traces of, say, the first 100 Hecke eigenvalues? Sorting these will let us match up orbit labels, and comparing them to the corresponding values obtained from the q-expansions stored in modularforms2 would be a great sanity check.
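
A rough sketch of the matching idea, assuming each source can be reduced to a dictionary from a space (level, weight, character number) to a list of trace vectors, one per Galois orbit. The data layout, the function names and the a, b, c labelling are assumptions made for this illustration, and the numbers in the example are toy values rather than real traces.

```python
def label_orbits(trace_vectors):
    """Label orbits 'a', 'b', ... in lexicographic order of their trace vectors.

    Hypothetical convention for this sketch; it only covers up to 26 orbits.
    """
    ordered = sorted(tuple(t) for t in trace_vectors)
    return {chr(ord("a") + i): t for i, t in enumerate(ordered)}


def compare_sources(first, second):
    """Return the spaces on which two sources disagree.

    Both arguments map (level, weight, character) to a list of trace vectors.
    Vectors are truncated to the shortest length seen for that space, so a
    source with 49 traces can be compared against one with 100.
    """
    mismatches = {}
    for space in sorted(set(first) | set(second)):
        a = [tuple(t) for t in first.get(space, [])]
        b = [tuple(t) for t in second.get(space, [])]
        n = min((len(t) for t in a + b), default=0)
        a = sorted(t[:n] for t in a)
        b = sorted(t[:n] for t in b)
        if a != b:
            mismatches[space] = (a, b)
    return mismatches


# Toy values, made up for illustration only.
source_one = {(7, 4, 1): [(1, 0, 3), (2, -1, 5)]}
source_two = {(7, 4, 1): [(2, -1, 5, 8), (1, 0, 3, 4)], (7, 4, 2): [(1, 1, 1)]}

print(label_orbits(source_one[(7, 4, 1)]))
print(compare_sources(source_one, source_two))
```

Here the first space agrees after truncation to three traces, while the second space is reported because only one source contains it.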

AndrewVSutherland commented 8 years ago

As reported by @jvoight (see https://github.com/LMFDB/lmfdb/issues/1442#issue-156183423), the page http://lmfdb.org/ModularForm/GL2/Q/holomorphic/32/2/29/ claims that this space is empty but also conjugate to the evidently non-empty space http://lmfdb.org/ModularForm/GL2/Q/holomorphic/32/2/5/. I will add a test to check for these kinds of errors to the test script https://github.com/LMFDB/lmfdb/blob/master/lmfdb/modular_forms/elliptic_modular_forms/emf_test_pages.py.
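
The kind of check meant here could look roughly like the sketch below. It is not the code in emf_test_pages.py; the data layout and function name are assumptions for illustration, and the dimension 1 in the example is a placeholder, not the true dimension of the space. The underlying fact is that spaces attached to Galois-conjugate characters must all have the same dimension.

```python
def inconsistent_conjugates(dims, orbits):
    """Find character orbits whose conjugate spaces report different dimensions.

    dims:   {(level, weight, character): dimension reported by the database}
    orbits: list of (level, weight, [characters in one Galois orbit])
    A missing entry is recorded as None and also counts as a disagreement.
    """
    bad = []
    for level, weight, chars in orbits:
        seen = {chi: dims.get((level, weight, chi)) for chi in chars}
        if len(set(seen.values())) > 1:
            bad.append((level, weight, seen))
    return bad


# Placeholder dimensions: 1 stands for "non-empty", 0 for "claimed empty".
reported = {(32, 2, 5): 1, (32, 2, 13): 1, (32, 2, 21): 1, (32, 2, 29): 0}
char_orbits = [(32, 2, [5, 13, 21, 29])]

print(inconsistent_conjugates(reported, char_orbits))
```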

jwbober commented 8 years ago

I just posted a list of the first 50 traces for everything I've got at http://www.maths.bris.ac.uk/~jb12407/newform-fields/NEWFORM_TRACES. (I don't know why I chose 50 instead of 100...) Actually, nope, I did for(k = 1; k < 50; k++), so there are 49 traces...

The format is a newform [orbit] per line. On each line there is a list of labels, level.weight.character.number, then a semicolon, then a list of (49) integers. Those forms form a Galois orbit, and those are the traces of the first 49 Fourier coefficients.
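
A possible reader for this format, as a hedged sketch: it assumes only what is described above (labels of the form level.weight.character.number, a semicolon, then integers) and makes no assumption about brackets, commas or spacing, since those details are not spelled out here. The sample line and the function name are made up for illustration.

```python
import re

LABEL = re.compile(r"\d+\.\d+\.\d+\.\d+")
INTEGER = re.compile(r"-?\d+")


def parse_trace_line(line):
    """Split one line into ([(level, weight, character, number), ...], [traces])."""
    labels_part, separator, traces_part = line.partition(";")
    if not separator:
        raise ValueError("expected a semicolon between labels and traces")
    labels = [tuple(int(x) for x in label.split("."))
              for label in LABEL.findall(labels_part)]
    traces = [int(t) for t in INTEGER.findall(traces_part)]
    return labels, traces


# Hypothetical sample line; the real file may use different punctuation.
sample = "[7.4.2.1, 7.4.3.1]; [2, -1, 5, 0, -12]"
print(parse_trace_line(sample))
```

The character numbers read from the labels give the Galois orbit of characters directly, so the same pass can feed both the trace comparison and a check on conjugate spaces.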

It was good that I did this, because the data that I posted yesterday was somewhat garbage. (The polynomials should all be correct, as far as I know, but I completely screwed up when listing which forms go with which polynomial. Seeing now that all of the traces are integers gives me some confidence.)

I think I should be able to do some higher weight and small level.

AndrewVSutherland commented 8 years ago

@jwbober Wonderful, thank you. I have done similar computations using Magma for all the spaces that fall within the extent we currently list on http://beta.lmfdb.org/knowledge/show/mf.elliptic.extent. Tomorrow I will start working on matching all 3 lists (mine, yours, and what is currently in the LMFDB).

sehlen commented 8 years ago

Please wait before matching with the LMFDB. I screwed up a few spaces on Saturday when I started to "fix" them, but I was traveling for the last two days. I will take care of them tomorrow (the problem I caused has to do with the Galois conjugates), and maybe we should start the matching after the fixing is done (by replacing old records).

AndrewVSutherland commented 8 years ago

@sehlen, no problem, I'll focus on matching with @jwbober's list first (but presumably it is OK to check the newforms in the LMFDB with trivial character; we believe these are all now correct, right?)

jwbober commented 8 years ago

I just started a computation for all characters in (k, N) in [13,30] x [1,30]. I expect it will finish in a few hours, or at least by tomorrow evening. But I won't have time to look at the results until tomorrow evening anyway.

(In the next few weeks I'll clean up my code and get more organized, and then start these computations over "for real". I was just looking at the precision to which I've computed for levels 100 through 400 and weights 2 through 4, and it might be feasible to identify the Galois orbits for all of those spaces. Also, I have some ideas about working mod p to get the characteristic polynomials, which might work better, but that would be a bit more work. And maybe at some point I'll run into trouble factoring the polynomials, even if I can write the polynomial down.)

JohnCremona commented 8 years ago

Thanks to all three of you (at least) for doing this so systematically. When we (you) are done I think we should document in the appropriate data quality knowl that this data has been independently computed in Sage, Magma and JB's C code and checked for consistency, both internally and between the three sets of computations. I wish we could have such good independent checks for all the data we have!

sehlen commented 8 years ago

@AndrewVSutherland The issue reported by @jvoight (https://github.com/LMFDB/lmfdb/issues/1442#issue-156183423) was actually caused by my fix of pages like /5/4/4/ ending up being empty in my pull request last week. I'm sorry... I saw the issue this weekend myself but didn't have much time, so I just put it quickly on my personal todo list. It is fixed now in https://github.com/LMFDB/lmfdb/pull/1453, and I hope I didn't make another mistake...

AndrewVSutherland commented 8 years ago

@sehlen, great, I'll take a look at the new pull request either tonight or tomorrow morning.

AndrewVSutherland commented 8 years ago

@sehlen, I'm testing PR #1453 but I keep bumping into other problems (so far none obviously related to changes in the PR, so I am reporting them elsewhere). I get a "problem with Hecke orbits in the database" error message for http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/32/2/21/ (and also on beta).

sehlen commented 8 years ago

Well, of course, since the data is not fixed. There will be many more. I'm still preparing to fix all of the data but keep running into smaller issues.

I'm commenting on the first one you reported at https://github.com/LMFDB/lmfdb/issues/1454, and this one is something I'm currently working on. I saw that the dimension_table collection sometimes contains multiple entries for a Galois orbit when it should really contain just one. This leads to, for instance, multiple entries for the Galois orbit [5,13,21,29] on the page http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/32/2/, and to a problem on the page for 21, which thinks it should be completely in the db although it should in fact link to the page for 5, which is the Galois conjugate we have in the db.
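
To make the duplicate problem concrete, here is a small sketch of how such entries could be spotted once the records are loaded into memory. It works on plain dictionaries rather than the live database; the field names level and weight and the label format are assumptions for illustration (only space_orbit_label is a name taken from the actual records).

```python
from collections import Counter


def duplicate_orbit_entries(records):
    """Return {(level, weight, space_orbit_label): count} for keys seen more than once."""
    counts = Counter(
        (r["level"], r["weight"], r["space_orbit_label"]) for r in records
    )
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical records; the label format "32.2.5" is invented for this example.
table = [
    {"level": 32, "weight": 2, "space_orbit_label": "32.2.5"},
    {"level": 32, "weight": 2, "space_orbit_label": "32.2.5"},
    {"level": 32, "weight": 2, "space_orbit_label": "32.2.1"},
]

print(duplicate_orbit_entries(table))
```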

I'm running a first update on the database records right now, which prepares a second one. It'll take a while because I have to touch all records, and maybe I have to look at what you guys found out about indices and large updates in such a context, but it should be done tomorrow. (The second update, which will eliminate the duplicate records, will be fast. The running update adds a missing entry, space_orbit_label pointing to the representative, to many records. I have no idea who created all these or when, but anyway, I'm working on it.)

jwbober commented 8 years ago

I've just updated http://www.maths.bris.ac.uk/~jb12407/newform-fields/ with all (k, N) in [2,30] x [2,30], and the NEWFORM_TRACES file now contains the first 99 traces. Also, the file is actually sorted now.

Oops. I meant to do 100 this time, and I'm running it again just to do 100 instead of 99, but it takes a few hours because it's I/O bound. (Every newform is stored in a different file, so it has to read the beginning of a few million files.)

AndrewVSutherland commented 8 years ago

@sehlen I understand (and expected) that the problems are likely due to outstanding issues in the data (although I'm surprised by the behavior in https://github.com/LMFDB/lmfdb/issues/1454, which just appears to hang); I'm reporting them in case it is helpful to you.

@jwbober You are a madman (but one after my own heart). There is certainly no need to go to 100 rather than 99, but proceed as you wish :).

AndrewVSutherland commented 5 years ago

Addressed by #2717. See http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/Reliability.