HEPData / hepdata_lib

Library for getting your data into HEPData
https://hepdata-lib.readthedocs.io
MIT License

Submission.create_files: Set remove_old default to False #193

Closed AndreasAlbert closed 2 years ago

AndreasAlbert commented 2 years ago

Closes #192

clelange commented 2 years ago

Thanks for the quick fix! I think we should take this a bit further and actually prohibit the directory in which the Python script resides from being the same as the output directory (the script directory must also not be a subdirectory of the output directory). This should protect us from accidental file deletion. What do you think?
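For illustration, the check suggested here could be sketched as follows. This is only a rough sketch, not code from hepdata_lib: the function name `is_safe_output_dir` and its arguments are hypothetical, and it assumes both paths can be resolved to absolute paths.

```python
from pathlib import Path


def is_safe_output_dir(script_dir, output_dir):
    """Hypothetical helper: return True only if the script directory is
    neither the output directory itself nor located somewhere inside it."""
    script = Path(script_dir).resolve()
    output = Path(output_dir).resolve()
    # script.parents lists every ancestor of the script directory, so this
    # also catches the case where the script sits in a subdirectory.
    return script != output and output not in script.parents
```

With this sketch, a script in `/work/analysis` writing to `/work/analysis/output` would be accepted, while writing to `/work/analysis` or `/work` would be rejected.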

AndreasAlbert commented 2 years ago

I support making this more robust overall. However, I think that tacking on individual checks for specific edge cases might not be the best approach, as we might still miss something. How about we instead force the user to use a previously nonexistent directory the first time they execute create_files? That would require us to preserve, between runs, the knowledge of whether or not a given directory was originally created by hepdata_lib. We could easily accomplish this by depositing an empty signifier file (e.g. $DIRECTORY/.created_by_hepdata_lib) in the desired directory. Each time it is run, create_files would check whether the output directory already exists and whether the signifier file exists. If the directory does not yet exist, we proceed as normal and create the directory as well as the signifier file. If the directory exists but the file does not, we exit and give the user a warning telling them that they should use a dedicated empty directory in order to avoid trouble.

Right now, though, we have code published on PyPI that can accidentally wipe user files with default settings. Therefore, let's please merge this hot fix and mint a new version. That buys us a little time to think through how to really fix this once and for all.
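As a rough illustration of the signifier-file idea described above, a minimal sketch might look like the following. It is not the hepdata_lib implementation: the function name `prepare_output_dir` and its return values are hypothetical, raising an exception stands in for "exit and give the user a warning", and treating a directory that already contains the signifier file as safe to reuse is an assumption that the comment implies but does not state.

```python
import os

# File name taken from the proposal above.
MARKER = ".created_by_hepdata_lib"


def prepare_output_dir(outdir):
    """Hypothetical helper: decide whether outdir may be used for output."""
    marker = os.path.join(outdir, MARKER)

    if not os.path.exists(outdir):
        # First run: create the directory and drop the empty signifier file.
        os.makedirs(outdir)
        open(marker, "w").close()
        return "created"

    if not os.path.isfile(marker):
        # The directory exists but was not created by this tool: refuse.
        raise RuntimeError(
            f"{outdir} already exists and was not created by hepdata_lib; "
            "please use a dedicated, previously nonexistent directory."
        )

    # Assumption: a directory carrying the signifier file can be reused.
    return "reused"
```

A caller would invoke `prepare_output_dir("output")` before writing or removing any files, so that deletion can only ever touch a directory that the tool created itself.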