LMFDB / lmfdb-inventory

inventory of the lmfdb database

pull request for hmfs collections #55

Open JohnCremona opened 7 years ago

JohnCremona commented 7 years ago

I'm about to copy hmfs.forms.search and hmfs.forms.search.rand to the cloud. Later hmfs.forms.search.stats will follow (still being created). I'll do these myself. These are fairly small (mongodump bson file is 124M) and already exist on the cloud so this is simple.

After this at some point soon I'll want to copy hmfs.hecke to the cloud -- a new collection not there yet -- which is a bigger item (bson dump is >75G). It will replace hmfs.forms but for a short time we'll need to have both in place.

The only collections which are larger, as measured by their bson dump files, are:

modularforms2.vector_on_basis: 132G
modularforms2.ap.chunks: 260G
Lfunctions.Lfunctions: 206G

JohnCremona commented 7 years ago

OK, so I just discovered that the update script is unhappy about a collection name not listed in db-collections.txt, which is reasonable. I can add forms.search.rand to that file, but it is part of the lmfdb-gce git repository, so rather than edit my copy (in my home dir on ms) I have cloned the lmfdb-gce repo there and will use that.
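
The check described above can be sketched roughly as follows. This is only an illustration and not the actual lmfdb-gce update script: the one-name-per-line file format, the handling of `#` comments, and the function names `parse_collection_list` and `check_collection` are all assumptions.

```python
def parse_collection_list(text):
    """Parse the list of known collections.

    Assumes (hypothetically) one collection name per line, with blank
    lines and lines starting with '#' ignored; the real file may differ.
    """
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line)
    return names


def check_collection(name, allowed):
    """Refuse to proceed for a collection that is not in the list."""
    if name not in allowed:
        raise ValueError(f"collection {name!r} is not listed in db-collections.txt")
    return name
```

With the file contents read into a string, `check_collection("hmfs.forms.search.rand", parse_collection_list(text))` would raise until the new name has been added to the list.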

Please will someone install emacs on ms?! I don't want to clone this repo on a machine of my own just for this...

JohnCremona commented 7 years ago

...OK, so I had a clone already, so I updated the hmfs collection names. I'll make a PR to the lmfdb-gce repository. Meanwhile I did what I said I would for forms.search (including .rand and .stats, which are now in the list), checked all was well and deleted forms.search.old.

JohnCremona commented 7 years ago

Just started copying hmfs.hecke after checking with @edgarcosta that there was enough space (~85G). After the open PRs on lmfdb code are merged it will be possible to delete hmfs.forms. Of course we'll try that out on beta first (and not literally delete it at first).

JohnCremona commented 7 years ago

Finished copying hecke collection.

JohnCremona commented 7 years ago

Question for either @AndrewVSutherland or @edgarcosta: Can I find out from the cloud when each collection was last updated? I know we talked about keeping a log of runs of the update script, but it would also be nice to be able to get a list of collections on the production database and when they were last updated. (I am not saying that I have forgotten which I have or have not updated... honestly)

JohnCremona commented 7 years ago

I suppose that was too good to be true. beta is working fine but when I renamed hmfs.forms to hmfs.forms.bak it wasn't. I renamed it back again and will see what I did wrong.

AndrewVSutherland commented 7 years ago

@JohnCremona Regarding your question, short of scrolling through the mongo db log-file I don't know of an easy way to determine when a collection was last updated.

edgarcosta commented 7 years ago

I also don't know an easy way to do that. Unless we keep track of it somewhere, it is not a trivial thing to do. One way would be to keep track of it in some collection.

I checked the ObjectId of the restored/moved objects, and they keep the original timestamp. Hence we can't use that to keep track of when the data was moved to the cloud, but we can use it to see when the last object was inserted in that collection.
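
For illustration, here is a minimal sketch of how the timestamp can be read out of an ObjectId. It relies only on the documented layout (the first 4 of the 12 bytes are a big-endian Unix timestamp in seconds); the helper name `objectid_created_at` is made up for this example.

```python
from datetime import datetime, timezone


def objectid_created_at(oid_hex):
    """Return the creation time embedded in a 24-character hex ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    # The first 4 bytes (8 hex digits) are seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Applied to the largest `_id` in a collection, this gives the time of the most recent insert, not the time of the copy. With pymongo available, the `generation_time` attribute of a `bson.ObjectId` should give the same value.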

JohnCremona commented 7 years ago

Perhaps we should do what drew suggested before and log all uses of the 2 scripts to a file. Meanwhile I'll check my bash history...
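
A rough sketch of what such a log could look like, assuming a plain tab-separated text file with one line per run. The file layout, the script label and the function names `format_log_line` and `log_run` are hypothetical; nothing like this exists in the scripts yet.

```python
from datetime import datetime, timezone


def format_log_line(when, user, script, collections):
    """Build one tab-separated log line: time, user, script, collections."""
    stamp = when.isoformat(timespec="seconds")
    return f"{stamp}\t{user}\t{script}\t{' '.join(collections)}\n"


def log_run(logfile, user, script, collections):
    """Append a record of this run to the log file and return the line."""
    line = format_log_line(datetime.now(timezone.utc), user, script, collections)
    # Append mode keeps earlier records intact.
    with open(logfile, "a", encoding="utf-8") as fh:
        fh.write(line)
    return line
```

Each script would call `log_run` once per invocation, so the last update of a collection could be found by searching the file for its name.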

JohnCremona commented 7 years ago

OK so now on beta there is no collection hmfs.forms (it is renamed as hmfs.forms.bak; I did not use forms.old since that notation comes with an expectation that there is another collection without the '.old' suffix). All HMF pages on beta are working normally (please check). Next step is to push the latest code change to prod; then it will be safe to actually delete the hmfs.forms on prod since hmfs.hecke is already there. I'll do the code push now and check that it's OK on the hour...
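
The rename-first, delete-later procedure could be sketched like this. It is an illustration only: `db` is assumed to be a pymongo `Database` object for the hmfs database, and the helper names `retire_collection` and `drop_backup` are invented here.

```python
def retire_collection(db, name, backup_suffix=".bak"):
    """Rename db[name] to a backup name instead of dropping it outright."""
    backup = name + backup_suffix
    existing = db.list_collection_names()
    if name not in existing:
        raise ValueError(f"no collection named {name!r}")
    if backup in existing:
        raise ValueError(f"backup {backup!r} already exists")
    # The data stays in place under the new name, so this is reversible.
    db[name].rename(backup)
    return backup


def drop_backup(db, backup):
    """Delete the backup once the site has been checked without it."""
    db.drop_collection(backup)
```

Keeping the two steps as separate calls leaves room for checking the pages in between, and renaming back is still possible until `drop_backup` is run.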

edgarcosta commented 7 years ago

I tried some, and it looks good to me.

JohnCremona commented 7 years ago

Thanks. I missed 1400 (my time) and any moment now...yes, new code now running on prod. @edgarcosta I'm sure you can rename the hmfs.forms collection on the cloud (prod database) more easily than I can, say to hmfs.forms.bak? If you do that and the pages all still work then that collection can be deleted.

edgarcosta commented 7 years ago

done! see http://www.lmfdb.org/api/

everything seems to work fine.

please, double-check before I delete the collection.

JohnCremona commented 7 years ago

Yes, it looks good to me (even the L-functions). You can delete forms.bak from the prod database. Let's keep it on beta for a while, although it can be reconstructed from the union (in effect) of forms.search and hecke.

edgarcosta commented 7 years ago

Collection dropped.

JohnCremona commented 7 years ago

Thanks. I'll comment on an issue in LMFDB/lmfdb so that others notice (e.g. JV & DY).

JohnCremona commented 6 years ago

I think this issue can be closed but have not checked thoroughly.