Closed amhanson9 closed 2 months ago
@emkaser, my current solution, if the header row does not have the expected values, is to print an error (as we do when there is no preservation log) and not update the log. You'd need to run validation again after fixing the log.
If you plan to fix the log, I could still add the validation data to the log (maybe with the standard header row, so it is clear what the data is) in addition to printing the error, so you'd have the validation information when you reformat the log.
Before automatically updating preservation_log.txt, verify that it has the expected columns. There are legacy files with different column names that are missing the collection and accession number needed for the new row in the log. The legacy columns are Date, Electronic Media Identifier, Action, Staff.
The script gets the collection and accession number from the first two columns of the last row in preservation_log.txt, so if those columns aren't what is expected, it puts the wrong information in the new row. And if the new row has a different number of columns than the rest of the log, it will cause a ParserError if the log is ever read into pandas again.
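
A rough sketch of the check described above, not the script's actual code. It assumes the log is tab-delimited, and the names in EXPECTED_HEADER are hypothetical placeholders, since the standard header is not listed in this issue. The functions header_matches and append_row are also made up for illustration.

```python
import csv
from pathlib import Path

# Hypothetical standard header: replace with the real column names.
EXPECTED_HEADER = ["Collection", "Accession", "Date", "Media Identifier", "Action", "Staff"]


def header_matches(log_path, expected=EXPECTED_HEADER):
    """Return True if the first row of the log is exactly the expected header."""
    with open(log_path, newline="", encoding="utf-8") as log:
        first_row = next(csv.reader(log, delimiter="\t"), [])
    return first_row == list(expected)


def append_row(log_path, values, expected=EXPECTED_HEADER):
    """Append a row to the log only if the log has the expected header.

    values holds every column after collection and accession, which are
    copied from the last row. Returns True if the log was updated.
    """
    log_path = Path(log_path)
    if not log_path.exists():
        print(f"ERROR: no preservation log at {log_path}")
        return False
    if not header_matches(log_path, expected):
        print(f"ERROR: unexpected header in {log_path}; log not updated")
        return False

    with open(log_path, newline="", encoding="utf-8") as log:
        # Skip blank lines so a trailing empty line is not treated as the last row.
        rows = [row for row in csv.reader(log, delimiter="\t") if row]
    if len(rows) < 2:
        print(f"ERROR: {log_path} has no data rows to copy the collection from")
        return False

    new_row = rows[-1][:2] + list(values)
    if len(new_row) != len(expected):
        # A row of the wrong width is what would break a later pandas read.
        print("ERROR: new row has the wrong number of columns; log not updated")
        return False

    with open(log_path, "a", newline="", encoding="utf-8") as log:
        csv.writer(log, delimiter="\t", lineterminator="\n").writerow(new_row)
    return True
```

With this shape, a legacy log that starts with Date, Electronic Media Identifier, Action, Staff is left untouched and the error is printed instead, so validation would need to be run again after the log is fixed.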