Open mccabete opened 7 years ago
@tonygardella What DB errors do we tend to get?
Let's make an editable list! I'll start a checklist here -- feel free to edit this comment (pencil button in the top right of the issue) to add stuff, or check stuff off that has been implemented.
- dbfiles records -- i.e. dbfiles records with no input, model, etc. records
- input records -- i.e. input records that don't have corresponding dbfiles records
- formats records that don't have any matching dbfiles
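As a rough illustration, the three checks listed above could be written as anti-join queries. This is only a sketch against a toy in-memory SQLite schema: the table layout, the `container_type`/`container_id` link from dbfiles to inputs, and the `format_id` column are simplifying assumptions rather than a verified copy of the real database, and `find_orphans` is a hypothetical name.

```python
import sqlite3

# Toy stand-in for the real tables; every column name here is an assumption.
SCHEMA = """
CREATE TABLE formats (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE inputs  (id INTEGER PRIMARY KEY, name TEXT, format_id INTEGER);
CREATE TABLE dbfiles (id INTEGER PRIMARY KEY, file_name TEXT,
                      container_type TEXT, container_id INTEGER);
"""

# One anti-join per checklist item; each returns the ids of suspect records.
ORPHAN_QUERIES = {
    "dbfiles_without_input": """
        SELECT d.id FROM dbfiles d
        LEFT JOIN inputs i ON i.id = d.container_id
        WHERE d.container_type = 'Input' AND i.id IS NULL
        ORDER BY d.id""",
    "inputs_without_dbfiles": """
        SELECT i.id FROM inputs i
        LEFT JOIN dbfiles d
          ON d.container_type = 'Input' AND d.container_id = i.id
        WHERE d.id IS NULL
        ORDER BY i.id""",
    "formats_without_dbfiles": """
        SELECT f.id FROM formats f
        WHERE NOT EXISTS (
            SELECT 1 FROM inputs i
            JOIN dbfiles d
              ON d.container_type = 'Input' AND d.container_id = i.id
            WHERE i.format_id = f.id)
        ORDER BY f.id""",
}


def find_orphans(conn):
    """Return {check name: [record ids]} for each orphan check."""
    return {name: [row[0] for row in conn.execute(sql)]
            for name, sql in ORPHAN_QUERIES.items()}


def demo_connection():
    """Build a small in-memory database with one orphan of each kind."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO formats VALUES (?, ?)",
                     [(1, "met format"), (2, "unused format")])
    conn.executemany("INSERT INTO inputs VALUES (?, ?, ?)",
                     [(10, "met with file", 1), (11, "met without file", 1)])
    conn.executemany("INSERT INTO dbfiles VALUES (?, ?, ?, ?)",
                     [(100, "met.nc", "Input", 10),
                      (101, "stray.nc", "Input", 99)])
    return conn
```

Calling `find_orphans(demo_connection())` reports one suspect record per check. Only the Input container type is covered here; model and other containers would need their own joins.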
@ankurdesai suggestion -- Function for a given site and met product that would purge everything except original data download.
Mike: a function that removes old runs, and especially the met data produced by old runs.
It used to be that failed downloads of raw met data created a db record even though the actual files were not there. For example, if a site only had 2000-2010 and you ran 1998-2002, it would update the input record to say 1998-2002 was there and only download 2000-2002. Not sure how to fix this mismatch other than to delete old runs and their associated files as much as possible.
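One cheap check for the "record exists but the files do not" case is to test each recorded path against the filesystem. A minimal sketch, assuming each file record can be fetched as an `(id, file_path, file_name)` tuple; that tuple shape and the name `missing_files` are hypothetical:

```python
import os


def missing_files(records):
    """Return the records whose file is not present on disk.

    records: iterable of (dbfile_id, file_path, file_name) tuples.
    The tuple shape is an assumption made for this sketch.
    """
    return [rec for rec in records
            if not os.path.isfile(os.path.join(rec[1], rec[2]))]
```

This only works when run on the machine that holds the files, so records belonging to other machines would need to be filtered out first.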
> failed downloads of raw met data created a db record even though actual files are not there
Input records, especially met records, should have a start date, end date, and format specification, right? Perhaps we can use the format record and load_data functionality to get the actual bounds of the data, compare them to the start and end date, and, if there is a mismatch, flag the file for deletion. Similarly, would it be too aggressive to delete any input file that is missing any of these specifications? That is, are there circumstances where it's OK for an input to be missing a start date, end date, or format?
@ashiklom soil datasets often do not have a start and end date
@ashiklom I think that would work. We could update the records to reflect the data that is actually there or delete them. I think for this it would be best to just use the met format records.
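The comparison discussed above could look roughly like the following. It is a sketch only: the actual bounds are passed in as arguments (in practice they would come from reading the files, e.g. via the load_data route mentioned above), and the function names and status labels are made up for illustration. Records without dates, such as soil datasets, are reported separately rather than flagged.

```python
from datetime import date


def check_input_range(recorded_start, recorded_end, actual_start, actual_end):
    """Classify an input record by comparing recorded and actual date bounds."""
    if recorded_start is None or recorded_end is None:
        return "no_dates"   # e.g. soil datasets; not an error by itself
    if actual_start is None or actual_end is None:
        return "no_data"    # record exists but no data could be read
    if actual_start <= recorded_start and actual_end >= recorded_end:
        return "ok"         # the data cover the recorded range
    return "mismatch"       # record claims more than the files contain


def corrected_range(recorded_start, recorded_end, actual_start, actual_end):
    """Return the part of the recorded range the data cover, or None if empty."""
    start = max(recorded_start, actual_start)
    end = min(recorded_end, actual_end)
    return (start, end) if start <= end else None
```

For the example from earlier in the thread (record says 1998-2002, files only hold 2000-2002), `check_input_range` returns `"mismatch"` and `corrected_range` gives the 2000-2002 span that the record could be updated to.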
A checklist of functions to create. Also available here in a Google Doc.
A quick and dirty synthesis of GitHub comments/issues, turned into a list of functions:
Mass Execution functions
Checks
Find DB Entry Functions
Find Run Functions
This issue is stale because it has been open 365 days with no activity.
It would be good to have a script, a set of functions, or both that round up records that need to be fixed. This could apply to files without input records, inputs with no file records, file formats without files, dbfiles records that point to files that don’t exist, etc.
This could just be a set of queries that write records to a generic, readable file that can be hand-edited and then fed back into a function that culls the records that remain.
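The export, hand-edit, cull round trip could be sketched as below. This is an illustration under assumptions, not a proposed implementation: it uses an SQLite connection and a two-column dbfiles table as stand-ins, and `export_candidates` and `cull_from_csv` are hypothetical names. The idea is that a reviewer deletes from the CSV any row that should be kept, and whatever is left in the file gets removed from the database.

```python
import csv
import sqlite3


def export_candidates(conn, sql, handle):
    """Write candidate rows (id, file_name) to a CSV file for hand editing."""
    writer = csv.writer(handle)
    writer.writerow(["id", "file_name"])
    writer.writerows(conn.execute(sql))


def cull_from_csv(conn, handle):
    """Delete the dbfiles rows whose ids are still listed in the edited CSV.

    Returns the list of ids that were submitted for deletion.
    """
    ids = [(int(row["id"]),) for row in csv.DictReader(handle)]
    with conn:  # commit on success, roll back if any delete fails
        conn.executemany("DELETE FROM dbfiles WHERE id = ?", ids)
    return [i[0] for i in ids]
```

Keeping the review step in a plain CSV means the candidate query can be as aggressive as needed, since nothing is deleted until someone has looked at the list.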