jacksund opened 1 year ago
I've used this in the past for exporting & importing databases, and I run an automated dump every night for backup.
Anything you're planning to implement here?
The plan is to have the `simmate database reset` command offer to download and load the dump file. If you're interested in sharing the lab's calculation results with others, we can eventually add some Warren Lab data to the prebuilds too.
I'm not sure if this will be faster or slower than the `load-remote-archives` command, so there's a chance I scrap this feature too.
In my experience, it takes 4-12 hours to load all the archives into postgres. I presume the bottleneck is that the CDN is rate limited.
For reference, I believe our database with all of matproj, jarvis, cod, and oqmd is ~4 GB. Not sure how it compares to the original data that is stored at the CDN.
> I presume the bottleneck is that the CDN is rate limited.
Downloading from the CDN is actually really quick and only takes a few minutes with UNC's crazy internet speeds. The slow part is taking that CSV data and saving it to your postgres database. Right now, the bottleneck is recalculating the MatProj hull energies for all systems, so I need to cache these. Once cached, I bet the `load-remote-archives` command will only take ~1-2 hrs.
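The caching idea above could be as simple as memoizing the hull-energy result per chemical system. A minimal sketch, where `expensive_hull_energy` is a stand-in for simmate's actual (slow) calculation, not its real API:

```python
from functools import lru_cache

CALLS = []  # tracks how often the expensive step actually runs


def expensive_hull_energy(chemical_system: str) -> float:
    # stand-in for recalculating the MatProj hull energy (the slow step);
    # the real inputs would be the system's structures and energies
    CALLS.append(chemical_system)
    return 0.0  # dummy value for illustration


@lru_cache(maxsize=None)
def hull_energy(chemical_system: str) -> float:
    # repeated calls for the same chemical system hit the cache
    # instead of recomputing
    return expensive_hull_energy(chemical_system)
```

In practice the cache would need to live in the database rather than in memory so it survives between loads, but the access pattern is the same.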
> Not sure how it compares to the original data that is stored at the CDN.
You can look at the files in `~/simmate/sqlite-prebuilds/` to see what's stored in the CDN. These are really just CSV files compressed into a ZIP, and I think they're normally ~1-2 GB.
Describe the desired feature
Just like how sqlite3 has prebuilds, we can do the same with postgres dump files: https://www.postgresql.org/docs/8.1/backup.html
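As a sketch of what producing and loading such a prebuild might involve, the helpers below build `pg_dump`/`pg_restore` command lines (using postgres's custom dump format, which is compressed and restorable with `pg_restore`). The helper names are hypothetical, not part of simmate:

```python
def dump_command(dbname: str, out_file: str) -> list[str]:
    # pg_dump's custom format (-Fc) is compressed and designed
    # to be loaded back with pg_restore
    return ["pg_dump", "--format=custom", f"--file={out_file}", dbname]


def restore_command(dbname: str, dump_file: str) -> list[str]:
    # --clean drops existing objects before recreating them,
    # which mirrors a "database reset" workflow
    return ["pg_restore", "--clean", f"--dbname={dbname}", dump_file]
```

These would be handed to something like `subprocess.run` against a live postgres server; connection details (host, user, password) are omitted here.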
Additional context
No response
To-do items
No response