catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Re-running PUDL initialization w/ default settings deletes DB #323

Closed: zaneselvans closed this issue 4 years ago

zaneselvans commented 5 years ago

Describe the bug: If one already has a database initialized and then runs

init_pudl.py settings_init_pudl_default.yml

the pre-existing database is deleted and replaced with nothing, which is probably not what the user wants.

Expected behavior: If the user already has a database and tries to initialize with nothing, we should probably leave the existing database untouched and output a big, obvious message.
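To make the expected behavior concrete, here is a rough sketch of the kind of guard I have in mind. It is not the actual PUDL code: the function name `init_db`, the `tables` argument, and the `clobber` flag are all hypothetical, and it assumes a file-based SQLite database purely so the example is self-contained.

```python
import logging
import sqlite3
from pathlib import Path

logger = logging.getLogger(__name__)


def init_db(db_path, tables, clobber=False):
    """Hypothetical guard around database initialization.

    Returns True if the database was (re)built, False if it was left alone.
    """
    db_path = Path(db_path)
    if db_path.exists():
        if not tables:
            # Initializing with nothing: never touch the existing database.
            logger.warning(
                "Database %s already exists and no data was requested; "
                "leaving it untouched.", db_path
            )
            return False
        if not clobber:
            # Refuse to overwrite unless the user explicitly opts in.
            raise FileExistsError(
                f"{db_path} already exists; pass clobber=True to replace it."
            )
        db_path.unlink()

    conn = sqlite3.connect(db_path)
    try:
        for name in tables:
            # Placeholder schema, just enough to show that a table exists.
            conn.execute(f'CREATE TABLE "{name}" (id INTEGER PRIMARY KEY)')
        conn.commit()
    finally:
        conn.close()
    return True
```

The point is only that the destructive path requires an explicit opt-in, and that an empty settings file is treated as a no-op with a loud warning rather than as a request to wipe everything.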

cmgosnell commented 5 years ago

how do you think this should translate to the new packaging arrangement?

zaneselvans commented 5 years ago

Given that we're going to dump the DB here shortly, it probably doesn't translate directly, but I wanted to remain aware of it. And maybe more generally, we should make sure that nothing people are likely to do without thinking too hard can wipe out all of their data.

zaneselvans commented 4 years ago

No longer directly relevant. Implementing a clobber check for the FERC 1 SQLite DB in #408, and we already have a clobber check for the data packages. We should also have a clobber check for the datapkg_to_sqlite1 script, though...