LMFDB / lmfdb

L-Functions and Modular Forms Database

Fix errors in classical modular form data and verify its correctness #1248

Closed AndrewVSutherland closed 5 years ago

AndrewVSutherland commented 8 years ago

In the thread for critical issue https://github.com/LMFDB/lmfdb/issues/1223 (errors in modular form data), it was suggested that we should hide the spaces with non-trivial character. This has not yet been done, and there are still many spaces with non-trivial character that have some but not all Hecke orbits missing. For example:

http://lmfdb.warwick.ac.uk/ModularForm/GL2/Q/holomorphic/11/11/10/ http://lmfdb.warwick.ac.uk/ModularForm/GL2/Q/holomorphic/11/13/10/ http://lmfdb.warwick.ac.uk/ModularForm/GL2/Q/holomorphic/3/97/2/

and many more. I will shortly post an updated version of emf_db_stats.py that finds all of these (now in merged pull request https://github.com/LMFDB/lmfdb/pull/1249).

We need to decide (within the next 12 hours!) whether to hide these spaces or not.

fredstro commented 8 years ago

Exactly how is it checked? Are you comparing each embedding to a specific embedding (as Sage gives now, which will give an error if the field is just isomorphic but not identical)? Or do you try to identify each embedding by comparing different coefficients? Or is it simply that none of your embeddings match an embedding of our coefficient?

On 10 May 2016, at 12:24, Jonathan Bober notifications@github.com wrote:

Here is a list of 958 errors for level < 100. (Something like 400 for level < 50), with all the coefficients checked:

errors_1_99_new.txt https://github.com/LMFDB/lmfdb/files/257037/errors_1_99_new.txt — You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/LMFDB/lmfdb/issues/1248#issuecomment-218130071

jwbober commented 8 years ago

On Tue, May 10, 2016 at 12:25 PM, Fredrik Strömberg < notifications@github.com> wrote:

I don’t think it is an error and the coefficient field is actually correct!

Hmm, ok. I was giving the minimal polynomial of a2, which doesn't necessarily need to be the same as the polynomial you're giving for the number field (which explains why the coefficient of a2 isn't alpha.) When I only use prime coefficients, that form looks fine.

jwbober commented 8 years ago

Here is a much smaller list of errors if I just do the comparison on a(p). For the first form on the list, it seems to be another case of an incorrect character number.

15.10.4.a 15.12.4.a 15.6.4.a 15.8.4.a 15.9.11.a 17.10.13.a 17.6.13.a 17.8.13.a 20.10.9.a 20.12.9.a 20.4.9.a 20.6.9.a 20.8.9.a 20.9.11.a 21.11.13.a 21.11.8.a 21.5.13.a 21.7.13.a 21.7.8.a 21.9.13.a 55.2.2.a 55.2.32.a 55.2.32.b 55.2.4.a 61.2.3.a 61.2.4.a 63.2.37.a 63.2.37.b 63.2.62.a 64.9.31.b 77.2.24.a 96.7.31.a 96.7.79.a 99.3.56.a

I'm still not finished yet, because it takes a while to compute the embeddings. (I'm just using 300 bits of precision instead of 53. Maybe that's not always enough, but it seems pretty good.)
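
For readers following along, here is a minimal sketch of what "comparing embeddings on a(p)" could look like. It is not the code actually used in this thread: it uses ordinary Python floats with a fixed tolerance instead of 300-bit arithmetic, the function names are hypothetical, and it assumes each embedding is stored as a list with `coeffs[n - 1] = a(n)`. The sample values are made up purely to exercise the code.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small coefficient indices."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def match_embeddings(ours, theirs, tol=1e-9):
    """Greedily pair each embedding in `ours` with an unused one in `theirs`.

    Only prime-index coefficients a(p) are compared, so a wrong composite
    coefficient does not prevent a match. Returns a dict mapping each index
    in `ours` to the matched index in `theirs`, or to None if nothing matches.
    """
    unused = set(range(len(theirs)))
    matching = {}
    for i, a in enumerate(ours):
        matching[i] = None
        for j in sorted(unused):
            b = theirs[j]
            primes = [n for n in range(2, min(len(a), len(b)) + 1) if is_prime(n)]
            if all(abs(a[n - 1] - b[n - 1]) <= tol for n in primes):
                matching[i] = j
                unused.discard(j)
                break
    return matching


# Toy data (illustrative only): two complex-conjugate embeddings.
ours = [[1, 1j, -1j, 5, 2], [1, -1j, 1j, 5, 2]]
# Same embeddings in the other order, with a deliberately wrong a(4).
theirs = [[1, -1j, 1j, 7, 2], [1, 1j, -1j, 5, 2]]
matching = match_embeddings(ours, theirs)
```

A greedy pairing like this is only safe when distinct embeddings are well separated at some prime; otherwise a proper bipartite matching would be needed.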

fredstro commented 8 years ago

The easiest is probably to construct the number field of both and just call ‘is_isomorphic’, although this might be quite time-consuming and there is probably some pari function which does this very efficiently. Or you can make a list of “definite errors” by comparing some suitable invariants (well, I can only think of the discriminant, but I am sure other people know others). I also haven’t been able to find a function which realises the isomorphism, which would be good in order to compute further coefficients.

jwbober commented 8 years ago

I'm not actually doing the comparison via the number field. That was just a check I did (wrongly) by hand after the automated check that compared embeddings.

fredstro commented 8 years ago

I see, but how do you compare the embeddings?

JohnCremona commented 8 years ago

I replaced that tunnel from lehner and hope it will work better, sorry.

On 10 May 2016 at 04:11, Fredrik Strömberg notifications@github.com wrote:

I see now, it works fine from home so something must be wrong with that tunnel. Unfortunately since it is there I can’t open a new tunnel, which means I can’t really do any computations there at the moment…

On 10 May 2016, at 12:07, AndrewVSutherland notifications@github.com wrote:

Which mongodb connection? I'm not having any problems connecting to the mongodb on Warwick.

On 2016-05-10 07:03, Fredrik Strömberg wrote:

I will have a look at it as soon as the mongodb connection is working again…

Fredrik

On 10 May 2016, at 11:47, Jonathan Bober notifications@github.com wrote:

There are still a lot of problems. I've been computing the embeddings to higher precision, and seem to be able to run more reliable comparisons now. For example, I missed http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/18/7/11/a/ before.

That does not have the right number field (and I don't know what would have the right number field. Doesn't look like character misnumbering). (Should be

x^12+42*x^11-165*x^10-71370*x^9-1433985*x^8+42645216*x^7+2438352250*x^6+19378764480*x^5-819344025249*x^4-21266453099898*x^3+155340226123755*x^2+10059647233716762*x+157044341640420289.)

I will post a list in a little while. I have almost 1000 forms with N < 100, and 400 with N < 50.

On Tue, May 10, 2016 at 4:34 AM, Stephan Ehlen notifications@github.com wrote:

The space @AndrewVSutherland https://github.com/AndrewVSutherland have not been recomputed by now but I made sure that the 5 above have no link leading to them at least.

jwbober commented 8 years ago

It took me a while to find the problem by hand, but A[25] is wrong for that form. (I'm going to stop doubting myself now.)

fredstro commented 8 years ago

Ok, it is just a bit hard to know since we don’t see what you are seeing :) So this was yet another problem coming from the recursive formula.

Could you perhaps upload your forms to the database so we can all run your checks? Since I am going through your lists (in particular, the wrong a(p)’s have to be looked at individually), it would be good to be able to see how the fixes progress.

At some point soon all forms should simply be redone to make sure, but especially now that we try to do the embeddings properly it can take up to a day to recompute a single form…
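
The thread does not say exactly which recursive formula is meant. Assuming it is the standard Hecke recursion for a normalized eigenform of weight k and character chi, a(p^(r+1)) = a(p) a(p^r) - chi(p) p^(k-1) a(p^(r-1)), a small sketch of how a(25) is derived from a(5) might look like this (the function name is hypothetical):

```python
def prime_power_coefficients(a_p, p, k, chi_p, max_exp):
    """Return [a(1), a(p), a(p^2), ..., a(p^max_exp)] for a normalized eigenform.

    a_p   : the coefficient a(p)
    p     : the prime
    k     : the weight
    chi_p : the value chi(p) of the character (0 if p divides the level)
    """
    coeffs = [1, a_p]
    for _ in range(2, max_exp + 1):
        # Hecke recursion: a(p^(r+1)) = a(p) a(p^r) - chi(p) p^(k-1) a(p^(r-1))
        coeffs.append(a_p * coeffs[-1] - chi_p * p ** (k - 1) * coeffs[-2])
    return coeffs[: max_exp + 1]


# Weight 2, level 11, trivial character: a(5) = 1 and a(2) = -2.
powers_of_5 = prime_power_coefficients(1, 5, 2, 1, 2)
powers_of_2 = prime_power_coefficients(-2, 2, 2, 1, 3)
```

An error in the stored character value chi(p) propagates to every a(p^r) with r >= 2 while leaving a(p) itself untouched, which would be consistent with a form that looks fine on prime coefficients but has a wrong A[25].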

jwbober commented 8 years ago

I don't want to pollute the database more, but maybe I can produce a simple text file with lists of the coefficients that I have for everything that should be in the LMFDB right now.

I have something like 4 to 5 TB of data stored in an awkward format. I'm not trying to hide it, but it also isn't very easy to share right now.

Eventually, we should figure out how to fill in the blanks in the LMFDB with "embedded modular forms" (or whatever they should be called) when we have embeddings of all of the modular forms but not the algebraic information. (This all started because I just wanted to match the LMFDB data with what I had so that I could put in modular form L-functions.)

fredstro commented 8 years ago

That’s alright. I added a testing function (probably very slow…) which compares the coefficients against what Sage gives from the stored subspace, and which is able to test the forms you found at least. Fredrik

jwbober commented 8 years ago

See http://www.maths.bris.ac.uk/~jb12407/bober-modform-coefficients

I suppose that this is really the first thing I should have done.

For each space (level, weight, chi) where the LMFDB has data and I have data, I've printed out the first 50 coefficients of all of the forms in the space. The format is just

label ; coefficients

where the label is of the form level.weight.chi.embedding, and the coefficients are a space-separated list of python-formatted complex numbers. So, (if I can get this right on the first try without running the code), to import these into python you would do something like

    for line in open('bober-modform-coefficients'):
        label, coefficients = line.split(';')
        level, weight, chi, embedding = [int(x) for x in label.split('.')]
        coefficients = [complex(x) for x in coefficients.split()]
        # do something

My embeddings are tied up with the characters, so for comparisons with the LMFDB you would have to consider all of the characters in a Galois orbit at the same time. You can do something like orbit = [psi.number() for psi in DirichletGroup_conrey(level)[chi].galois_orbit()] to get the labels of all of the characters in a Galois orbit.

As far as I'm concerned, my embeddings are currently arbitrary, and also subject to change because for some silly reason I computed forms for chi and 1/chi separately. I should at least make sure that complex conjugate forms have the same embedding number.
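
If Sage is not at hand, the Galois orbit of a character can also be sketched in plain Python. This is a stand-in for the `DirichletGroup_conrey` call above, not a replacement for it. It assumes the Conrey numbering, in which the character numbered m modulo q raised to the k-th power is the character numbered m^k mod q, so that the Galois conjugates are the powers with k coprime to the multiplicative order of m. The function names are hypothetical.

```python
from math import gcd


def multiplicative_order(m, q):
    """Order of m in the unit group modulo q (requires gcd(m, q) == 1)."""
    if gcd(m, q) != 1:
        raise ValueError("m must be coprime to q")
    order, power = 1, m % q
    while power != 1 % q:
        power = power * m % q
        order += 1
    return order


def conrey_galois_orbit(q, m):
    """Conrey numbers of all characters Galois-conjugate to character m mod q."""
    order = multiplicative_order(m, q)
    # The conjugates of chi are chi^k with k coprime to the order of chi.
    return sorted({pow(m, k, q) for k in range(1, order + 1) if gcd(k, order) == 1})


orbit_of_5 = conrey_galois_orbit(36, 5)
orbit_of_7 = conrey_galois_orbit(36, 7)
```

For level 36 this gives the orbits {5, 29} and {7, 31}, which is how the characters are grouped in the newformfields.36.7 sample further down the thread.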

jwbober commented 8 years ago

I also have some information about coefficient fields. See http://www.maths.bris.ac.uk/~jb12407/newform-fields/

There are a bunch of files there, one for every weight and level in [2, 12] x [1, 100] (except for level 2, where my code runs in an infinite loop for some reason).

As an example, consider http://www.maths.bris.ac.uk/~jb12407/newform-fields/newformfields.36.7

It looks like

36.7.5.0 36.7.5.1 36.7.5.2 36.7.5.3 36.7.5.4 36.7.5.5 36.7.29.0 36.7.29.1 36.7.29.2 36.7.29.3 36.7.29.4 36.7.29.5 x^12-6*x^11-69*x^10-12906*x^9+364095*x^8-9412848*x^7+17911530*x^6-6861966192*x^5+193495010895*x^4-5000048831034*x^3-19487638017189*x^2-1235346792567894*x+150094635296999121
36.7.7.0 36.7.7.1 36.7.7.2 36.7.7.3 36.7.7.4 36.7.7.5 36.7.7.6 36.7.7.7 36.7.7.8 36.7.7.9 36.7.7.10 36.7.7.11 36.7.7.12 36.7.7.13 36.7.7.14 36.7.7.15 36.7.7.16 36.7.7.17 36.7.7.18 36.7.7.19 36.7.7.20 36.7.7.21 36.7.7.22 36.7.7.23 36.7.7.24 36.7.7.25 36.7.7.26 36.7.7.27 36.7.7.28 36.7.7.29 36.7.7.30 36.7.7.31 36.7.7.32 36.7.7.33 36.7.31.0 36.7.31.1 36.7.31.2 36.7.31.3 36.7.31.4 36.7.31.5 36.7.31.6 36.7.31.7 36.7.31.8 36.7.31.9 36.7.31.10 36.7.31.11 36.7.31.12 36.7.31.13 36.7.31.14 36.7.31.15 36.7.31.16 36.7.31.17 36.7.31.18 36.7.31.19 36.7.31.20 36.7.31.21 36.7.31.22 36.7.31.23 36.7.31.24 36.7.31.25 36.7.31.26 36.7.31.27 36.7.31.28 36.7.31.29 36.7.31.30 36.7.31.31 36.7.31.32 36.7.31.33 x^68+x^67+x^66-88*x^65-88*x^64+1136*x^63+139176*x^62-194112*x^61-845136*x^60+351777760*x^59+94167808*x^58-4195054208*x^57-156315393280*x^56+247717007360*x^55-1903352338432*x^54+37210141900800*x^53-196149009678336*x^52-1916424711438336*x^51+36765066850271232*x^50+28581317074747392*x^49-1842190557241147392*x^48-39074788091784855552*x^47+149764255643439464448*x^46-412829668904854880256*x^45+9545807364713367994368*x^44-79457646879649805893632*x^43-47120249292909868744704*x^42+2437093231249267963723776*x^41+32942642404035514500907008*x^40-217684034565155605828337664*x^39-3165472873850954552671469568*x^38+30688565922870515786651271168*x^37+31288038133852929196321406976*x^36+1306945832835585082068540850176*x^35-14424066519619341343937431339008*x^34+83644533301477445252386614411264*x^33+128155804196261597988132482973696*x^32+8044823425284968490375910829064192*x^31-53107822146738216336352621979762688*x^30-233736452329669227045944314231259136*x^29+2263801148306485651262869787155365888*x^28+10718449382931016835209351472772808704*x^27-13263171072322110156580864209740365824*x^26-1431381715515709280206024845438524325888*x^25+11005566589612457089727054563629405831168*x^24-30461452993288432487966709812876525174784*x^23+707241701182499638208327118976320754679808*x^22-11809630055032239606667684149284834
050572288*x^21-35633147668785352851551215156920619339087872*x^20+35381956782340696491398731745475606828023808*x^19+2912828691261081663172680407330292074337533952*x^18-9717427742988071833226654282999733370270777344*x^17-63653992922346344572072363467595145703919190016*x^16+772824411588635998545952475962133142219993907200*x^15-2529989214086500266003855674379945434216327544832*x^14+21073432397758083472965964679903109183764997079040*x^13-851061952183932137327301002186599339832749462650880*x^12-1461763046666120094168149914405875003749505645412352*x^11+2100012292112153602112492343440842557809754665648128*x^10+502073996305270905608550267772371832207617110038282240*x^9-77197977985455623042436836548906785381246012852535296*x^8-1134780023302304909033414680992892518045809511128629248*x^7+52071923680554430603242506311199079819170321305914310656*x^6+27201795850369527235860659731917988009232619418758938624*x^5-134859607596198219535534538389227208158449042752156991488*x^4-8631014886156686050274210456910541322140738736138047455232*x^3+6277101735386680763835789423207666416102355444464034512896*x^2+401734511064747568885490523085290650630550748445698208825344*x+25711008708143844408671393477458601640355247900524685364822016
36.7.17.0 36.7.17.1 x^2+4050
36.7.19.7 x-8
36.7.19.8 x+8
36.7.19.4 36.7.19.10 x^2+4*x+64
36.7.19.1 36.7.19.2 36.7.19.3 36.7.19.6 36.7.19.11 36.7.19.13 x^6-10*x^5-28*x^4+992*x^3-1792*x^2-40960*x+262144
36.7.19.0 36.7.19.5 36.7.19.9 36.7.19.12 x^4+76*x^2+4096

This means that the forms 36.7.19.0, 36.7.19.5, 36.7.19.9, and 36.7.19.12 have coefficient field generated by a root of x^4+76*x^2+4096.

36.7.7.0, etc are in a degree 68 extension.

In lots of cases, especially in higher level and weight, and for characters with large orbits, the polynomial is a ?. For example http://www.maths.bris.ac.uk/~jb12407/newform-fields/newformfields.65.9 starts with

65.9.3 ?
65.9.22 ?
65.9.42 ?
65.9.48 ?
[...]

Which means that right now I know nothing about the number field for those forms. (I guess I could give an upper bound for the degree.)

I should be choosing the minimal polynomial for a(2) when possible (that is, if all of the forms in the space have distinct a(2)). In general I think it is the minimal polynomial of (a(2) + a(3) + a(4) + ... + a(n)) for some n. (I probably ought to be omitting composite coefficients there, but I don't think I was.)
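
A small sketch of how one line of these files could be read, assuming the layout shown in the samples above: whitespace-separated labels followed by either a polynomial or a `?`, with the polynomial written monic and leading term first. The helper names are hypothetical.

```python
import re


def parse_field_line(line):
    """Split a newformfields line into (labels, polynomial).

    The polynomial is returned as a string, or None when the file has '?'.
    """
    tokens = line.split()
    labels, polynomial = tokens[:-1], tokens[-1]
    return labels, (None if polynomial == '?' else polynomial)


def degree(polynomial):
    """Degree of a polynomial string whose first term is x or x^d."""
    match = re.match(r'x(?:\^(\d+))?', polynomial)
    if match is None:
        raise ValueError("expected a leading term of the form x or x^d")
    return int(match.group(1) or 1)


labels, poly = parse_field_line("36.7.19.4 36.7.19.10 x^2+4*x+64")
unknown_labels, unknown_poly = parse_field_line("65.9.3 ?")
```

In the level 36, weight 7 sample the number of labels on a line equals the degree of the polynomial, which makes a cheap consistency check when reading the files.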

jvoight commented 8 years ago

Sorry to drop this without reading the whole long thread, but hopefully someone who's been following the line-by-line can decide if this has already been fixed. Drew and I were playing around for actual math reasons and found http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/17/2/2/ with a "Problem with Hecke orbits in the database!" error.

sehlen commented 8 years ago

OK, so this is the usual reason. In a previous version of the program the characters got confused, and this one has not been replaced yet.

It is now accessible and should be correct. All of these issues will be resolved very soon but please keep informing us when you see something like that! Thanks!

Stephan

JohnCremona commented 8 years ago

Would it be sensible to run a script to find all the records created before a certain date, using this "previous version" and delete them and replace them? You could do that in three passes: first, identify which objects these are; second, recompute those and put them into a new temporary collection; third, delete the bad ones and replace them (quickly) with the recomputed ones.
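
The three passes can be sketched as follows. This is only an illustration on an in-memory dict; the real data lives in MongoDB, and the field name `created`, the function name, and the `recompute` callback are all assumptions made for the example.

```python
from datetime import date


def replace_stale_records(collection, cutoff, recompute):
    """Replace every record created before `cutoff`; return the affected labels."""
    # Pass 1: identify the records produced by the old version.
    stale = sorted(label for label, rec in collection.items() if rec["created"] < cutoff)
    # Pass 2: recompute them into a temporary staging area, leaving the live data alone.
    staging = {label: recompute(label) for label in stale}
    # Pass 3: swap the recomputed records in, which is quick once staging is complete.
    collection.update(staging)
    return stale


# Made-up records, purely to exercise the function.
collection = {
    "17.2.2.a": {"created": date(2016, 4, 1), "status": "suspect"},
    "11.2.1.a": {"created": date(2016, 5, 9), "status": "ok"},
}
stale = replace_stale_records(
    collection,
    cutoff=date(2016, 5, 1),
    recompute=lambda label: {"created": date(2016, 5, 12), "status": "recomputed"},
)
```

Keeping the slow recomputation in pass 2 separate from the swap in pass 3 means the live collection is never left half-updated.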

sehlen commented 8 years ago

We do something like that but sometimes there are little problems. But essentially we're running exactly such scripts now and the data will finally (we'll let you and everyone know when we think this is the case) be always correct.

jwbober commented 8 years ago

How much is still wrong? And are you recomputing everything, or just what I identified? (It seems ridiculous how long Sage takes to compute embeddings precisely. But maybe I just don't understand how hard that is.)

AndrewVSutherland commented 8 years ago

@fredstro,@sehlen,@jwbober, can't Magma compute some of these? Would that not be a way to get another independent sanity check of (at least some of) the data?

I realize that the first priority is to fix the cases that are obviously broken, but we could still very easily miss problems that are less obvious. We should be looking to do any and all validation/verification of the data that we can, ideally using different computational methods.

sehlen commented 8 years ago

Yes, we'll do it. I wanted to use Magma for a long time to generate data as well, which we should start doing once we fixed everything.

I'm now traveling so I'll be offline for a while but I'll continue working on it shortly. It shouldn't take too long anymore.

fredstro commented 8 years ago

As Stephan said, we have definitely been thinking about using Magma even to compute the data in some cases when Sage is ridiculously slow. The big issue which needs to be solved first is how to represent the modular forms independently of the method of computing. At the moment we use the information given by Sage in a way which William at some point determined was good (for instance a Hecke orbit is determined by: a matrix, a dual basis matrix 'B', a dual eigenvector 'v' and a vector which tells you something about non-zero entries). See https://github.com/sehlen/modforms-db/blob/refactor/databases.md for more information if you are interested.

For those of you who don't know about some of the issues with this storage: one of the problems we have been having with non-trivial characters is that B has coefficients in the base ring and v has coefficients in the eigenvalue field, and multiplying these two together (which is how you get coefficients) can be extremely slow in Sage. We therefore tried to coerce B explicitly to have the same base as v, but that for some reason (we store B as a Sage matrix object) sometimes increased the size of the storage required for B by a factor of 1000 (so we stopped doing that). The other problem we are having now is that to compute the embeddings of coefficients accurately we are using a routine which goes via QQbar in Sage, which is again extremely slow. We are open to any constructive (and concrete) suggestions! (This should probably be in a new issue or something...)

jwbober commented 8 years ago

My code is robust, and I believe that it is basically bug-free. I'm doing floating point computations, but I use arb, so I have rigorous error bounds on all of my coefficients. I don't use multiplicativity to compute coefficients, but I check that a(nm) intersects a(n)a(m) for coprime (n,m), which is a strong check. (I suppose I could also check the relations between a(p) and a(p^k).) Also, as I mentioned elsewhere, Dave Platt has checked RH at small height, which is an additional check of sorts. When I can identify the coefficient fields, it is because I multiplied a bunch of (x - a) together to get an acb_poly_t which contains exactly one integer polynomial, which is an additional sort of check on the small coefficients.

So checks against my data should be good enough. It is new, and there may still be some errors/bugs, but if it agrees with what we have in the LMFDB then we should be in good shape. There are some gaps due to different ranges of data. Maybe I can fill some in.

I've decided to try a bit more seriously to actually find coefficient fields and Galois orbit information, so I've started computing to higher precision in spaces where I haven't been able to do this. (The spaces in the LMFDB tend to have fairly small dimension, so they should be easier.) As a first goal, I'll see if I can identify the coefficient fields for all chi for weight <= 12 and level <= 100.

But anyway, we should also decide what to do when we only have embeddings, because we can get embeddings much more easily than exact algebraic numbers, which will quickly become intractable no matter how you try to compute them.
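
To illustrate the step of multiplying the (x - a) together and reading off an integer polynomial, here is a rough sketch. It is not the arb-based code described above: it uses plain Python complex floats and a fixed tolerance in place of rigorous ball arithmetic, so it is only trustworthy for small degrees and small coefficients. The function name is hypothetical.

```python
import math


def integer_poly_from_conjugates(conjugates, tol=1e-9):
    """Expand prod (x - a) over the given conjugates.

    Returns the integer coefficients, highest degree first, or None if some
    coefficient is not within `tol` of an integer.
    """
    coeffs = [complex(1)]
    for a in conjugates:
        # Multiply the current polynomial by (x - a).
        new = coeffs + [0j]
        for i, c in enumerate(coeffs):
            new[i + 1] -= a * c
        coeffs = new
    result = []
    for c in coeffs:
        nearest = round(c.real)
        if abs(c.imag) > tol or abs(c.real - nearest) > tol:
            return None
        result.append(nearest)
    return result


# The two roots of x^2 + 4*x + 64, one of the polynomials listed for level 36, weight 7.
roots = [complex(-2, math.sqrt(60)), complex(-2, -math.sqrt(60))]
poly = integer_poly_from_conjugates(roots)
```

With genuine interval arithmetic the test "within `tol` of an integer" becomes "the ball contains exactly one integer", which is what makes the original check rigorous; an incomplete set of conjugates shows up here as a `None` result.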

sehlen commented 8 years ago

Most cases should have been fixed by Fredrik and also some by me last week (mostly before the release) but there are issues left for sure and for many reasons which I can in part explain somewhere else if you really want to know ;-) Trivial character data should be in very good shape.

Let me quickly outline the timeline that I currently have in mind (and maybe you want to assign me if you want):

1) I am working on cleaning up code for "backend", "frontend" and our computation classes (modforms-db). I want to do it carefully and I am improving many things along the way. This delays fixing the data as far as I'm concerned but it helps in making sure it will be correct and working smoothly later on so I prefer to do this. I can't work on it full-time but should be able to have my current goals finished by Tuesday or Wednesday (By that time we'll also be ready to integrate the weight one data and while doing this I will set up instructions how other people can make their data LMFDB-ready soon.)

2) Then verification, recomputation and extension of the data should start. A first step would be to try a few cases, then run it semi-automatically on small "rectangles" of parameters and then on all of the data. @AndrewVSutherland has offered to run this on GCE if I understand him correctly. Maybe we could do this together as I would like to see the running processes if possible.

3) After that I need to take a break but then the next steps for classical modular forms would be implementing more features. Our data structure and our classes already allow for a lot of things to be implemented really easily so this should go pretty fast and as I add more documentation, other people can help more easily.
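A minimal sketch of what running on small "rectangles" of parameters could look like, assuming a rectangle means a block of consecutive levels and weights (the function name, block sizes and ranges here are hypothetical, not taken from modforms-db):

```python
def rectangles(level_max, weight_max, level_step=10, weight_step=4, weight_min=2):
    """Yield (levels, weights) blocks covering 1..level_max x weight_min..weight_max."""
    for n0 in range(1, level_max + 1, level_step):
        for k0 in range(weight_min, weight_max + 1, weight_step):
            yield (range(n0, min(n0 + level_step, level_max + 1)),
                   range(k0, min(k0 + weight_step, weight_max + 1)))

blocks = list(rectangles(level_max=25, weight_max=12))
# Every (level, weight) pair should land in exactly one block.
covered = {(n, k) for levels, weights in blocks for n in levels for k in weights}
print(len(blocks), len(covered))
```

Each block could then be verified or recomputed as one job, which keeps failures local to a small range of parameters.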

By the way, let me also mention that even though Kevin Buzzard states here https://galoisrepresentations.wordpress.com/2016/05/12/lmfdb/ that he has contacted LMFDB people to ask if we could incorporate weight one data, I don't think I have ever been asked. Maybe @fredstro ? I don't understand the comment about the labeling system in this context as we do have labels for newforms which just apply as they are for weight one. With George Schaeffer, I was able to get some examples that he computed into the db during the AIM workshop and even though I started duplicating code in the beginning I realized very soon that what we have is flexible enough to work with other people's data almost as-is. So I don't know why anyone would have refused their data. I certainly wouldn't have and we're getting it now after asking for their permission (before I even heard about the blog post, btw.).

AndrewVSutherland commented 8 years ago

On 2016-05-15 20:12, Stephan Ehlen wrote:

1) I am working on cleaning up code for "backend", "frontend" and our computation classes (modforms-db). I want to do it carefully and I am improving many things along the way. This delays fixing the data as far as I'm concerned but it helps in making sure it will be correct and working smoothly later on so I prefer to do this. I can't work on it full-time but should be able to have my current goals finished by Tuesday or Wednesday (By that time we'll also be ready to integrate the weight one data and while doing this I will set up instructions how other people can make their data LMFDB-ready soon.)

Excellent!

2) Then verification, recomputation and extension of the data should start. A first step would be to try a few cases, then run it semi-automatically on small "rectangles" of parameters and then on all of the data. @AndrewVSutherland has offered to run this on GCE if I understand him correctly. Maybe we could do this together as I would like to see the running processes if possible.

Great, let's plan to do this. My schedule is pretty wide open this week, so I am at your disposal. Send me your id_rsa.pub and I should be able to set it up so you can log directly into the cloud servers. I assume we need Sage 7.1 installed on them, anything else? (We can't run Magma in the cloud, but I do have 100+ cores at MIT where I can run Magma).

By the way, let me also mention that even though Kevin Buzzard states here that he has contacted LMFDB people to ask if we could incorporate weight one data, I don't think I have ever been asked. Maybe @fredstro ? I don't understand the comment about the labeling system in this context as we do have labels for newforms which just apply as they are for weight one. With George Schaeffer, I was able to get some examples that he computed into the db during the AIM workshop and even though I started duplicating code in the beginning I realized very soon that what we have is flexible enough to work with other people's data almost as-is. So I don't know why anyone would have refused their data. I certainly wouldn't have and we're getting it now after asking for their permission (before I even heard about the blog post, btw.).

Let me know if there is anything I can do to help with the process of getting the data from http://people.maths.ox.ac.uk/lauder/weight1/ integrated into your framework.


sehlen commented 8 years ago

On May 15, 2016, at 22:00, Andrew Sutherland notifications@github.com wrote:

Great, let's plan to do this. My schedule is pretty wide open this week, so I am at your disposal. Send me your id_rsa.pub and I should be able to set it up so you can log directly into the cloud servers. I assume we need Sage 7.1 installed on them, anything else? (We can't run Magma in the cloud, but I do have 100+ cores at MIT where I can run Magma).

Great - will do. We need Sage 7.1, and it would be great to have one mongodb instance (or a couple?) to which all instances could connect. Also, all instances should then have the lmfdb and our modforms-db repository checked out. I did this previously myself by using a disk from a snapshot in GCE, maybe I could even supply you with my latest working image somehow?

Using Magma would also be great, as Sage quickly becomes very slow (or fails) for non-trivial characters, because the number field degree gets large and linear algebra over number fields is slow in Sage. But preparing to be able to use that data might take a little longer.

JohnCremona commented 8 years ago

Buzzard specifically said that he had "stopped bugging Cremona about" including weight 1 forms. I'm sure he must have mentioned this to me at least once, but I was never "bugged" and might well have said something along the lines of "it's a lot more work than you might think".

jwbober commented 8 years ago

On Mon, May 16, 2016 at 1:12 AM, Stephan Ehlen notifications@github.com wrote:

1) I am working on cleaning up code for "backend", "frontend" and our computation classes (modforms-db). I want to do it carefully and I am improving many things along the way. This delays fixing the data as far as I'm concerned but it helps in making sure it will be correct and working smoothly later on so I prefer to do this. I can't work on it full-time but should be able to have my current goals finished by Tuesday or Wednesday (By that time we'll also be ready to integrate the weight one data and while doing this I will set up instructions how other people can make their data LMFDB-ready soon.)

I have said this a few times, but I'll repeat it again. It would be useful to think about how to deal with cases where we know a full space of modular forms, but only by complex embeddings of the coefficients. We can go much further with numerical computation than with exact computation. There's also an intermediate case where we may know the Galois orbit structure and a minimal polynomial for a generator of each coefficient field, but still don't have the coefficients expressed as algebraic numbers.

For an extreme example of some data that I have: In level 83, weight 4, there is a single Galois orbit of 40 nontrivial even characters, and a single 20 dimensional newform per character, so the coefficients live in a degree 800 extension of the rational numbers. If my computations are correct I have the minimal polynomial of a(2), which is x^800+41*x^799+[ 189000 more characters ]. (I tried asking Sage for the discriminant of this polynomial and it crashed.) Is that something that Sage or Magma is ever going to be able to compute exactly? Even if computing this space is possible, I'll just come up with something harder, of course.

2) Then verification, recomputation and extension of the data should start. A first step would be to try a few cases, then run it semi-automatically on small "rectangles" of parameters and then on all of the data. @AndrewVSutherland has offered to run this on GCE if I understand him correctly. Maybe we could do this together as I would like to see the running processes if possible.

Of course I don't want to discourage you from doing this, and I hope it is more tractable than I think, but I'm not sure how useful/cost effective GCE will be here. Last time I tried to do modular form computations in Sage for general characters I found that things were very slow and used a lot of memory. As in, computation of a single space may take more than 24 hours (limiting the use of preemptible instances) and many gigabytes of memory (to the point where you might not be able to make use of all cores on a single GCE high-cpu machine). Obviously, Stephan and Fredrik have more experience with this than me, and may know something about making these computations run better, but I guess I'm saying that if you want to do some really big computation on GCE then the most straightforward thing to do will be to focus on the more tractable cases, like trivial/small order character.

(I read more about GCE recently after Drew's comment about 70000 cores, or whatever the count is. That's a lot of CPU.)

At some point, we should also be trying Pari's new modular form functionality to see how well it works. It's probably not in any release yet, but there is something in some git branch, if not in the current development version.

By the way, let me also mention that even though Kevin Buzzard states here https://galoisrepresentations.wordpress.com/2016/05/12/lmfdb/ that he has contacted LMFDB people to ask if we could incorporate weight one data, I don't think I have ever been asked. Maybe @fredstro ? I don't understand the comment about the labeling system in this context as we do have labels for newforms which just apply as they are for weight one. With George Schaeffer, I was able to get some examples that he computed into the db during the AIM workshop and even though I started duplicating code in the beginning I realized very soon that what we have is flexible enough to work with other people's data almost as-is. So I don't know why anyone would have refused their data. I certainly wouldn't have and we're getting it now after asking for their permission (before I even heard about the blog post, btw.).

I didn't know about these weight 1 computations either, and I didn't realize that Magma would be able to go as far as they did. So that could be an indication that everything I think I know is wrong. On the other hand, in weight 1 the dimensions and the coefficients are much smaller, so although the algorithms for finding a basis are messier, the number fields are much simpler, which makes a lot of things easier.

sehlen commented 8 years ago

@jwbober I agree we should also have numerical data. But it is something new and we should see how useful it would be and how we could integrate it. We could put it on the list of things to do at the next workshop. Or maybe you have a good idea right now.

Your extreme case seems quite impressive and I don't think it is in reach. I'm surprised to see that the degree goes that high for level 83 and weight 4 (!), I have to say. But I guess that explains why we haven't succeeded in computing it (afaik; correct, @fredstro ?) ;-) It makes me wonder whether we should really work on non-trivial or quadratic characters at all... but then, many cases are doable and not all are so bad and we should probably still compute what we can.

Trying pari's new implementation is on the list of things I would like to do.

sehlen commented 8 years ago

@jwbober On the other hand, degree 800 means that you get a degree 20 extension of the cyclotomic field of order 41 in this case and maybe that is doable. I'll check out this case tomorrow ;-) But, as you said, you can probably always come up with a more complicated one but it would be nice to have non-trivial characters in a reasonable range in the lmfdb, afaic.

sehlen commented 8 years ago

Let me add that I know of course that we can't get very far with non-trivial characters (although seeing this already for level 83 and weight 4 came as a bit of a shock, even though I should have known it), but I guess a reasonable goal would be to complete and validate the squares of parameters currently advertised on the lmfdb and then compute data for characters of order <=2 as far as we can.

I would love to see numerical modular forms in the lmfdb as well but it's not completely clear to me how we should do it and how useful it is. Also, we just never talked about it before we had this conversation here... (I just want to say, I don't think I ever refused this nor did I ignore it)

JohnCremona commented 8 years ago

If someone felt like getting a message back to our critics who like to blog anonymously, @jwbober 's example might be worth mentioning: "only" level 83 weight 4, but a coefficient field of degree 800 and 200,000 characters needed for one of the first Fourier coefficients. The mathematical definitions are simple enough, but no-one has ever seen this newform with their naked eyes before.

jwbober commented 8 years ago

Level 83 is an extreme case because it is the upper half of a Germain prime pair (83 = 2*41 + 1), so there are two large orbits of Dirichlet characters. As Stephan says, this could in principle be somewhat easier than it seems, because it is "only" a degree 20 extension of a relatively simple degree 40 extension. This certainly makes things easier for me, because it means that I can do computations with just 20 embeddings at a time. But even with a working precision of 2000 bits, there are many fields in level < 100 and weight <= 12 that I haven't been able to identify yet (like level 83 and weight 5).
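The character count can be checked directly. The sketch below assumes the standard facts that the Dirichlet characters modulo a prime p form a cyclic group of order p - 1 and that characters of the same order form a single Galois orbit; the function name and the indexing chi_k are choices made for this illustration only:

```python
from math import gcd

def character_orbits_prime_level(p):
    """Count Dirichlet characters modulo a prime p by (order, parity).

    Write chi_k for the character sending a fixed generator g of (Z/p)^*
    to exp(2*pi*i*k/(p - 1)). Then chi_k has order (p - 1)/gcd(k, p - 1)
    and chi_k(-1) = (-1)^k, so chi_k is even exactly when k is even.
    """
    orbits = {}
    for k in range(p - 1):
        order = (p - 1) // gcd(k, p - 1)
        parity = "even" if k % 2 == 0 else "odd"
        orbits[(order, parity)] = orbits.get((order, parity), 0) + 1
    return orbits

orbits_83 = character_orbits_prime_level(83)
print(orbits_83)
# In even weight only even characters occur; the 40 even characters of order 41
# take values in Q(zeta_41), and 40 characters times newform dimension 20 gives 800.
print(orbits_83[(41, "even")] * 20)
```

The two large orbits are the 40 even characters of order 41 and the 40 odd characters of order 82; the remaining two characters are the trivial one and the quadratic one.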

And I know that you haven't been ignoring me, Stephan. I just wanted to bring it up again (and show off this polynomial I computed yesterday, I suppose). I also don't know yet what we should do with numerical data.

(I'm trying to compute the space in level 83 weight 4 using Magma now. I'll report back eventually when it finishes, if ever, or uses too much memory and crashes.)

AndrewVSutherland commented 8 years ago

@jwbober aside from the question of how to present the data, it would already be useful just to have an independently computed decomposition of the dimensions of each of the spaces into Hecke orbits, along with the degrees of the coefficient fields and their labels in the LMFDB when the degree is small enough. We could use this as a crude sanity check on the data (I know you have already pointed out several errors in the data that are visible at this level).

Is this something you could make available in some range of levels and weights? This morning I plan to update the data for levels N <= 24 and weight 2<=w<40 for the trivial character on www.lmfdb.org (see issue #1376) which now appears to be complete (or at least internally consistent), but any additional sanity checks we can do would be worth doing.
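A crude check of this kind could be sketched as follows, assuming both data sets can be reduced to a dictionary mapping (level, weight, character) to the list of Hecke orbit dimensions. The key format, the function name and the sample values are all hypothetical and only illustrate the comparison:

```python
def compare_decompositions(ours, theirs):
    """Report spaces whose Hecke orbit dimensions are missing or disagree."""
    problems = []
    for key in sorted(set(ours) | set(theirs)):
        a, b = ours.get(key), theirs.get(key)
        if a is None or b is None:
            problems.append((key, "missing", a, b))
        elif sorted(a) != sorted(b):
            problems.append((key, "mismatch", a, b))
    return problems

# Made-up sample data: the second data set splits one space differently
# and contains one space that the first lacks.
ours = {(11, 2, 1): [1], (23, 2, 1): [2], (37, 2, 1): [2]}
theirs = {(11, 2, 1): [1], (23, 2, 1): [2], (37, 2, 1): [1, 1], (43, 2, 1): [1, 2]}
for problem in compare_decompositions(ours, theirs):
    print(problem)
```

Comparing sorted dimension lists ignores how the orbits are labelled, so it only detects missing orbits or orbits of the wrong size.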

JohnCremona commented 8 years ago

If it were practical to have partial information for these "hard" case newforms, so that on the newform page it could show most of the newform's data, including the degree of the Hecke field if not its defining equation, but not the q-expansion, that would be a lot better than saying nothing. I can imagine a page similar to http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/23/12/1/b/ where at the top it just says "q-expansion not available" and at the bottom the "Download Fourier coefficients" section is missing. That does not leave a lot of information on the page, but it would leave some.

AndrewVSutherland commented 8 years ago

@JohnCremona I agree, in fact I might argue that we should do that even in cases where we do have the q-expansion but it is too large to display in any reasonable fashion, e.g.

http://beta.lmfdb.org/ModularForm/GL2/Q/holomorphic/23/38/1/a/

This page takes a very long time (10-20 seconds) to load, and the screen appears to be blank after it does (the coefficients are there, but you have to scroll down to see anything past the first two). I'm not sure this display is going to be useful to anyone.

So we could have pages in 3 classes:

(1) coefficients available and displayed
(2) coefficients available and not displayed
(3) coefficients not available

In all three cases we would still display all the other invariants. This would let us "fill out" our box very quickly, and we could then go back and add coefficient data in the cases where we really do want to display them and/or make them available for download.

I don't think it would be at all difficult to change the modular forms pages to accommodate this.
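As a sketch of how a page could pick one of these three classes, assuming the coefficients are stored as a list (or are absent) and that "too large to display" is decided by a character budget; the enum, the function and the threshold are hypothetical and not part of the existing LMFDB code:

```python
from enum import Enum

class CoefficientDisplay(Enum):
    DISPLAYED = 1                # coefficients available and displayed
    AVAILABLE_NOT_DISPLAYED = 2  # coefficients available and not displayed
    NOT_AVAILABLE = 3            # coefficients not available

def display_class(coefficients, max_chars=2000):
    """Choose a display class from the stored coefficients (None if not stored)."""
    if coefficients is None:
        return CoefficientDisplay.NOT_AVAILABLE
    size = sum(len(str(c)) for c in coefficients)
    if size <= max_chars:
        return CoefficientDisplay.DISPLAYED
    return CoefficientDisplay.AVAILABLE_NOT_DISPLAYED

print(display_class([1, -24, 252, -1472]))
print(display_class(["9" * 5000]))
print(display_class(None))
```

The other invariants would be rendered in all three cases; only the q-expansion section depends on the class.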

JohnCremona commented 8 years ago

@AndrewVSutherland That is a good example indeed -- on my desktop it takes an absurd length of time to render the q-expansion (still on 35%, much slower than on yours!). Perhaps we should make your proposal more widely?

jwbober commented 8 years ago

I put some data that I had up at a link I posted somewhere on this or another issue. I have some more now which I'll post tonight, possibly in a more useful way.

Also I realized while walking to work that I may be doing way more work than I need to be doing. Maybe after actually giving some thought to what I'm doing I can identify orbit structure much more quickly.

sehlen commented 8 years ago

I think we agreed at AIM last week that we will truncate / hide such q-expansions. This is coming once I have cleaned things up properly (it is only really useful if the page also loads faster, not just renders faster, which depends on having smaller records for displaying the pages, and that is what I'm working on). Maybe I should have created an issue ;-)

JohnCremona commented 8 years ago

OK, I suppose that if alpha is a_2 (the coefficient of q^2) and generates the field, then we might as well show the q-expansion as q+\alpha q^2+\dots where the coefficient field is Q(alpha) and alpha satisfies (some polynomial), as that conveys slightly more information than just saying what the coefficient field is: it tells you a_2. How often does a_2 generate the field? I don't know why Sage (I presume) uses a_2 to generate the field rather than finding a nicer generator and expressing all a_n in terms of that. Is it that doing a polredabs is too expensive? (And I realise that more than that is required in the case of relative extensions, i.e. characters of degree >2).

sehlen commented 8 years ago

BTW, I also have magma running to compute this high degree space since yesterday evening (after our discussion) and it's still running.

AndrewVSutherland commented 8 years ago

OK, I opened a separate issue for this, because I think it is something we can work on in parallel and possibly address more quickly. Feel free to comment at https://github.com/LMFDB/lmfdb/issues/1385.

In order to keep the discussions productive and avoid duplication, let's focus in this thread on the problem of computing and verifying the data, and use #1385 to discuss what information should be presented to the user and how.

AndrewVSutherland commented 8 years ago

@jwbober regarding https://github.com/LMFDB/lmfdb/issues/1248#issuecomment-219699826: great, as soon as you post this data I can put a script together to compare it against the data currently in mongo db, and we can also talk about partially filling in missing entries (as suggested in #1385).
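A comparison of this kind could look roughly like the following. This is only a minimal sketch, not the script discussed in this thread: the function name `compare_decompositions`, the `(level, weight, character)` key layout, and both toy dictionaries are assumptions made for illustration, and the "database" dictionary is deliberately wrong so that the check has something to report.

```python
def compare_decompositions(reference, database):
    """Compare Hecke-orbit dimension lists for each space.

    Both arguments map a (level, weight, character) tuple to the list of
    dimensions of the Hecke orbits in that newspace. The key layout is an
    assumption of this sketch, not the LMFDB schema.
    """
    report = {"missing": [], "mismatch": []}
    for space in sorted(reference):
        expected = sorted(reference[space])
        if space not in database:
            # The space is absent from the data being checked.
            report["missing"].append(space)
        elif sorted(database[space]) != expected:
            # Same space, but a different multiset of orbit dimensions.
            found = sorted(database[space])
            report["mismatch"].append((space, expected, found))
    return report


# Illustrative toy input: independently computed orbit dimensions.
reference = {(11, 2, 1): [1], (23, 2, 1): [2], (37, 2, 1): [1, 1]}
# Made-up "database" contents with one space absent and one orbit missing.
database = {(11, 2, 1): [1], (37, 2, 1): [1]}

report = compare_decompositions(reference, database)
print(report["missing"])
print(report["mismatch"])
```

Comparing sorted dimension lists only catches errors that are visible at the level of dimensions; it cannot tell apart two orbits of the same dimension, which is where coefficient or trace data would be needed.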

sehlen commented 8 years ago

@AndrewVSutherland, @jwbober, I think Bober has such a script. Maybe you could add it to the repository?


AndrewVSutherland commented 8 years ago

Will do.


jwbober commented 8 years ago

I just updated the data at http://www.maths.bris.ac.uk/~jb12407/newform-fields/. (See the README file, or http://www.maths.bris.ac.uk/~jb12407/newform-fields/newformfields.83.4 for my earlier fun example.) It is still incomplete, and not very convenient, but that's all I'm going to do for tonight.

I think it may always be incomplete, even in the fairly small range covered (already, identifying some of those spaces the way I'm doing this would require multiplying roots together to get a polynomial of degree > 10000 with integer coefficients) but now that I've got interested in actually finding the coefficient fields and Galois orbit structure, I'm going to see how far I can push this. Level <= 100 and weight <= 12 may be feasible.

Tomorrow I may actually look at this data and make it more readable, and also work a bit on identifying the rest of the spaces in level < 100.

AndrewVSutherland commented 8 years ago

@jwbober This is wonderful! In terms of comparing these with existing LMFDB entries, I guess I need to know (1) are you labeling characters in the same way (Conrey labels), and (2) do you know how to match up your Hecke orbits with those in the LMFDB (which according to the knowl are lexicographically ordered by coefficient trace sequences)?

jwbober commented 8 years ago

I am labelling characters the same way. There is some extra character information included in my labels, though, because the embeddings include a specific [embedding of a] character from the Galois orbit of each character. In principle, the character orbits could probably be deduced from those files (because any Galois orbit over Q has to include at least one embedding of each character), but you can also use something like DirichletGroup_conrey(N).galois_orbits() to get that information. (That returns a list of sets of characters, rather than integers, though, which is a design decision I now sometimes regret. Something like [[chi.number() for chi in orbit] for orbit in DirichletGroup_conrey(N).galois_orbits()] gets a list of lists of integers.)
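For readers without Sage at hand, the list of lists of integers can be sketched in plain Python. This is not the DirichletGroup_conrey implementation. It relies on two facts about Conrey labels: the label is multiplicative in the index, so the k-th power of the character numbered n mod N is the character numbered n^k mod N, and the Galois conjugates of a character of order d are its k-th powers with gcd(k, d) = 1. The function names below are made up for this sketch.

```python
from math import gcd


def multiplicative_order(n, N):
    """Order of n in the unit group mod N (assumes gcd(n, N) == 1, N >= 2)."""
    order, power = 1, n % N
    while power != 1:
        power = (power * n) % N
        order += 1
    return order


def galois_orbits(N):
    """Galois orbits of Dirichlet characters mod N as lists of Conrey numbers.

    Assumes N >= 2. Orbits are returned in order of their smallest member.
    """
    seen = set()
    orbits = []
    for n in range(1, N):
        if gcd(n, N) != 1 or n in seen:
            continue
        d = multiplicative_order(n, N)
        # Conjugates of chi_N(n, .) are chi_N(n^k, .) for k coprime to d.
        orbit = sorted({pow(n, k, N) for k in range(1, d + 1) if gcd(k, d) == 1})
        seen.update(orbit)
        orbits.append(orbit)
    return orbits


print(galois_orbits(5))
print(galois_orbits(7))
```

The naive order computation is fine for the small levels discussed here; for large N one would want to factor the group order instead.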

Right now the only sure way I know to match up Hecke orbits is to use the embeddings of the coefficients. I haven't tried to order the orbits "from scratch". I've posted coefficients separately at http://www.maths.bris.ac.uk/~jb12407/bober-modform-coefficients, which is hopefully a somewhat self-explanatory format. (My attempts to match these are what led to my realization that so much of the data was wrong.)


AndrewVSutherland commented 8 years ago

@jwbober If it's not much additional work (I'm guessing you might already have to do this much work to get the coefficient field anyway), do you think it would be feasible to include the traces of the first n Fourier coefficients (for some reasonably small n that is big enough for us to expect these will distinguish the Hecke orbits, maybe 20 or 100)? These would also provide a nice way to compare q-expansions that does not require everyone to represent the number field the same way (or require us to find an isomorphism or the polredabs of a defining polynomial).

In fact, given that the label is based on this, we could presumably use these to define an intrinsic signature of each Hecke orbit that doesn't actually require us to identify the coefficient field(s) in any canonical way (which may become very hard as the degree grows). For each space (N,k,chi) we only need enough traces to uniquely distinguish the Hecke orbits. I'm guessing it won't take very many traces in most cases, and this data might be small enough to display on the screen in cases where we don't want to display the coefficients.
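The trace-signature idea can be sketched as follows. This is a rough illustration under stated assumptions rather than a proposal for the actual format: the function names are hypothetical, and the trace sequences are made-up integers, not data from any real space. The sketch finds the shortest prefix length at which the trace sequences of all orbits in one space differ, and orders the orbits lexicographically by trace sequence.

```python
def distinguishing_length(trace_seqs):
    """Smallest n such that the length-n prefixes are pairwise distinct.

    Returns None if the available traces never separate all the orbits.
    """
    max_len = min(len(seq) for seq in trace_seqs)
    for n in range(max_len + 1):
        prefixes = {tuple(seq[:n]) for seq in trace_seqs}
        if len(prefixes) == len(trace_seqs):
            return n
    return None


def order_orbits(trace_seqs):
    """Sort orbits lexicographically by their integer trace sequences."""
    return sorted(trace_seqs)


# Made-up trace sequences (traces of a_1, a_2, a_3, a_4) for three orbits.
toy_traces = [
    [1, -2, -1, 2],
    [1, 0, -1, -2],
    [1, -2, 3, 2],
]

print(distinguishing_length(toy_traces))
print(order_orbits(toy_traces))
```

Because the traces are integers, the comparison needs no choice of defining polynomial or embedding, which is the point made above.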

sehlen commented 8 years ago

@jwbober Actually, Magma is better than I thought. Using not too much RAM (top shows 5.6g), it has been able to compute this space, and it seems to have taken maybe 1.5 days (I didn't time it, and there are so many commands I don't know in Magma ;-). Anyway, I'm saving it to a file now and will get back to you once I know the minimal polynomial of a(2). I already printed it, but on the stupid machine that I'm on I only have screen, and my buffer is set too small to scroll up to the beginning ;-)

But this shows that interfacing Magma for the lmfdb will be very helpful (I started sage around the same time - it is still running but the machine is also slower, I think. It uses about 10g virtual memory but the resident memory is actually not too bad with around 4.6g). I'll let you know if it terminates.

JohnCremona commented 8 years ago

Stephan, feel free to use atkin or lehner which have magma and 64g each.
