zooniverse / planet-four

Identify and measure features on the surface of Mars
https://www.planetfour.org/
Apache License 2.0

Mongo dump incomplete? #135

Closed michaelaye closed 9 years ago

michaelaye commented 9 years ago

According to the recent MongoDB tutorial, one needs a .bson file to import a Mongo dump back into a Mongo database. But the daily Mongo dump for P4 only contains:

planet_four_classifications.json planet_four_groups.json planet_four_subjects.json

Or are these not meant to be reimported into a MongoDB database?

parrish commented 9 years ago

That would be very strange. I just grabbed the last backup:

$ ls
planet_four_2015-04-12.tar.gz

$ tar xzvf planet_four_2015-04-12.tar.gz
x planet_four_2015-04-12/
x planet_four_2015-04-12/planet_four_classifications.bson
x planet_four_2015-04-12/planet_four_classifications.metadata.json
x planet_four_2015-04-12/planet_four_users.bson
x planet_four_2015-04-12/planet_four_subjects.metadata.json
x planet_four_2015-04-12/planet_four_users.metadata.json
x planet_four_2015-04-12/planet_four_subjects.bson

The .bson files contain the data and the .metadata.json files contain information about the collection (like indexes).
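A dump of this kind (.bson plus .metadata.json) is what mongorestore expects. The following is a minimal sketch, not the project's documented procedure: the helper name `print_restore_cmd` is hypothetical, and the target database name `ouroboros` is an assumption borrowed from the mongoimport commands later in this thread. It only prints the command, so nothing is written to a database until you run the output yourself.

```shell
# Dry run: print a mongorestore command for a dump directory containing .bson files.
# The default database name "ouroboros" is an assumption; substitute your own.
print_restore_cmd() {
  dump_dir=$1
  db=${2:-ouroboros}
  # Refuse directories without .bson files (e.g. dumps made with mongoexport).
  if ! ls "$dump_dir"/*.bson >/dev/null 2>&1; then
    echo "no .bson files in $dump_dir; mongorestore cannot use it" >&2
    return 1
  fi
  echo "mongorestore --db $db $dump_dir"
}
```

Calling `print_restore_cmd planet_four_2015-04-12` after extracting the archive prints the restore command for that directory; on a directory that holds only .json files it reports the problem and returns a non-zero status instead.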

michaelaye commented 9 years ago

This must have been recently fixed. @chrissnyder sent me a fixed link on March 31. See my extracts here:

klay6683 at macd2860 in ~/Downloads
$ for f in `ls saniti*`; do echo $f; done
sanitized_planet_four_2015-02-10.tar
sanitized_planet_four_2015-02-14.tar
sanitized_planet_four_2015-03-11.tar
sanitized_planet_four_2015-03-26.tar
sanitized_planet_four_2015-03-29.tar
sanitized_planet_four_2015-03-30.tar

klay6683 at macd2860 in ~/Downloads
$ for f in `ls saniti*`; do tar xvf $f; done
x sanitized_planet_four_2015-02-10/
x sanitized_planet_four_2015-02-10/planet_four_subjects.json
x sanitized_planet_four_2015-02-10/planet_four_classifications.json
x sanitized_planet_four_2015-02-10/planet_four_groups.json
x sanitized_planet_four_2015-02-14/
x sanitized_planet_four_2015-02-14/planet_four_subjects.json
x sanitized_planet_four_2015-02-14/planet_four_classifications.json
x sanitized_planet_four_2015-02-14/planet_four_groups.json
x sanitized_planet_four_2015-03-11/
x sanitized_planet_four_2015-03-11/planet_four_subjects.json
x sanitized_planet_four_2015-03-11/planet_four_classifications.json
x sanitized_planet_four_2015-03-11/planet_four_groups.json
x sanitized_planet_four_2015-03-26/
x sanitized_planet_four_2015-03-26/planet_four_subjects.json
x sanitized_planet_four_2015-03-26/planet_four_classifications.json
x sanitized_planet_four_2015-03-26/planet_four_groups.json
x sanitized_planet_four_2015-03-29/
x sanitized_planet_four_2015-03-29/planet_four_subjects.json
x sanitized_planet_four_2015-03-29/planet_four_classifications.json
x sanitized_planet_four_2015-03-29/planet_four_groups.json
x sanitized_planet_four_2015-03-30/
x sanitized_planet_four_2015-03-30/planet_four_subjects.json
x sanitized_planet_four_2015-03-30/planet_four_classifications.json
x sanitized_planet_four_2015-03-30/planet_four_groups.json

parrish commented 9 years ago

That's bizarre. Please reopen the issue and @-mention me on it if it happens again.

michaelaye commented 9 years ago

It just happened again.

https://www.dropbox.com/s/j4yknex1r28bjss/Screenshot%202015-04-13%2015.18.30.png?dl=0

I double-checked whether downloading with Safari vs. Chrome makes a difference, but both downloads look the same.

parrish commented 9 years ago

Ah. I hadn't realized you were getting the sanitized backup (just stripped of unnecessary user information).

The difference there is that those are generated with mongoexport and can be imported with mongoimport.

You can import them with something like:

mongoimport --db ouroboros --collection planet_four_subjects --file sanitized_planet_four_2015-04-14/planet_four_subjects.json
mongoimport --db ouroboros --collection planet_four_classifications --file sanitized_planet_four_2015-04-14/planet_four_classifications.json

You may need to add arguments for --host, --username, --password, etc. depending on your setup.
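To avoid typing one command per file, the commands above can be generated from the contents of an extracted sanitized dump. This is only a sketch under assumptions: the helper name `print_import_cmds` is hypothetical, each collection is assumed to be named after its .json file, and any extra arguments (such as `--host`) are appended unchanged. It prints the commands rather than running them.

```shell
# Dry run: print one mongoimport command per .json file in a sanitized dump directory.
# Collection names are taken from the file names; extra arguments are appended as-is.
print_import_cmds() {
  dump_dir=$1
  db=$2
  shift 2
  extra="$*"
  for f in "$dump_dir"/*.json; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    coll=$(basename "$f" .json)
    echo "mongoimport --db $db --collection $coll --file $f${extra:+ $extra}"
  done
}
```

For example, `print_import_cmds sanitized_planet_four_2015-04-14 ouroboros --host localhost` lists a command for every .json file in that directory, including planet_four_groups.json if it is present. Review the output, then pipe it to `sh` to run the imports.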