Open edbennett opened 6 months ago
I'll try to deal with this but in case Niccolo' gets here before I do, my idea is to use json more or less like this:
import json

big_dump = {
    "inputs": inputs,
    "some_result": the_result.tolist(),  # tolist() makes the array JSON-serializable
}
title_file = str(some_info) + ".json"
with open(title_file, "w") as json_file:
    json.dump(big_dump, json_file, indent=4)
so that any result (rho(E), rho(lambda), smearing_kernel(E) ) is printed together with the inputs
I'd suggest trying to keep as much metadata and provenance in each output file as can reasonably be done.
Things that should be relatively easy to get
If possible, getting the identifiers and attached provenance information of any input data and passing that through would be the gold standard, but that's harder to do when it's not present on the input.
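As a rough sketch of what carrying metadata alongside a result could look like: the field names below (created_utc, hostname, input_provenance and so on) and the helper names collect_metadata and dump_with_metadata are my own placeholders, not anything that exists in the code today, and the set of fields is only an assumption about what is cheap to collect.

```python
import json
import platform
import sys
from datetime import datetime, timezone


def collect_metadata(inputs, input_provenance=None):
    """Gather cheap-to-collect metadata for an output file (illustrative fields only)."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "hostname": platform.node(),
        "command_line": sys.argv,
        "inputs": inputs,
        # Pass through whatever provenance the input data carried; None if it had none.
        "input_provenance": input_provenance,
    }


def dump_with_metadata(path, result, inputs, input_provenance=None):
    """Write the result and its metadata to one JSON file and return the payload."""
    payload = {
        "metadata": collect_metadata(inputs, input_provenance),
        "result": result,
    }
    with open(path, "w") as json_file:
        json.dump(payload, json_file, indent=4)
    return payload
```

Keeping the metadata under its own key means a reader can pull out "result" without caring which provenance fields happen to be present.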
I've only taken a 10,000 foot view so far, but it looks as though some results are only output by the code as text in a log file.
Parsing data out of a free-form log file is annoying and error-prone; I'd recommend that any results a program needs to output are written to an appropriate data file format: this might be CSV, JSON, or HDF5. (Probably not HDF5 for the sizes of data here.) They could potentially still be output with logging.info as well, in case anyone wants to keep an eye on a run while debugging.
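Something like the following is what I have in mind for writing to a data file and logging at the same time. It is only a sketch: write_results, energies and rho_values are hypothetical names, and the two-column CSV layout is an assumption rather than a proposal for the final format.

```python
import csv
import logging

logger = logging.getLogger(__name__)


def write_results(path, energies, rho_values):
    """Write one row per energy to a CSV file and echo each row to the log."""
    rows = list(zip(energies, rho_values))
    with open(path, "w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["energy", "rho"])  # header row names the columns
        for energy, rho in rows:
            writer.writerow([energy, rho])
            # The log line is for watching a run; the CSV is the source of truth.
            logger.info("rho(E=%s) = %s", energy, rho)
    return len(rows)
```

Anything downstream then reads the CSV, and the log can change format freely without breaking analysis scripts.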