wanunulab / ioniq

GNU General Public License v3.0

Analysis pipeline logging and loading from logs (serialization of io, filter, and parser metadata) #16

Open a-fa opened 1 month ago

a-fa commented 1 month ago

We need a format for storing a workflow. What does that mean? Say we have taken a data file, read it, filtered it, parsed it into n different ranks, and arrived at a final analysis. The goal is to store all the steps taken up to that point, without duplicates or undone steps. The simplest way to do this is to pickle the instances of the io classes, filters, parsers, etc. that were used, and reload them later. But that is a problem: it is not future-proof (newer versions of the package may be unable to load and operate on old pickles), and pickles are not easily transferred between systems and people. Instead, we should store the parameters that went into constructing those instances, and later reconstruct the instances from the stored parameters. Ideally, the stored format should be human-readable.
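A minimal sketch of the parameter-based approach described above. The names `REGISTRY`, `register`, `to_record`, and `from_record`, and the stand-in `Butterworth` class, are hypothetical and not part of ioniq; the sketch assumes every component can be rebuilt from its constructor keyword arguments alone.

```python
import json

# Hypothetical registry mapping a stored class name to the class itself.
REGISTRY = {}

def register(cls):
    """Class decorator: make cls reconstructible by name."""
    REGISTRY[cls.__name__] = cls
    return cls

@register
class Butterworth:
    """Stand-in filter for illustration; it only stores its parameters."""
    def __init__(self, cutoff, order):
        self.cutoff = cutoff
        self.order = order
        # Keep the constructor arguments so they can be written out later.
        self.params = {"cutoff": cutoff, "order": order}

def to_record(kind, obj):
    """Describe obj by class name and constructor parameters instead of pickling it."""
    return {kind: {"class": type(obj).__name__, "params": obj.params}}

def from_record(record):
    """Rebuild an instance from a record produced by to_record."""
    (kind, spec), = record.items()   # each record holds exactly one entry
    return REGISTRY[spec["class"]](**spec["params"])

original = Butterworth(cutoff=20000, order=4)
text = json.dumps(to_record("filter", original))   # human-readable and parseable
rebuilt = from_record(json.loads(text))
```

Because only the class name and parameters are stored, a newer version of the package can still load the record as long as the constructor accepts the same arguments, which is the property pickles lack.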

dinboyko commented 1 month ago

Do we need just a human-readable log file, or one we can also parse?

a-fa commented 1 month ago

Both. For example:

[{"io": {
   "class": "EDHReader",
   "method": {"name": "read", "params": {"filename": "wherever/whatever.edh", "voltage_compress": true}}
  }},

 {"filter": {
   "class": "butterworth",
   "params": {"cutoff": "20000", "order": 4}
  }},

 {"parser": {
   "class": "SpikeParser",
   "params": {"height": 100e-9, "distance": 0.02}
  }},

 {"parser": {
   "class": "SpeedyStatSplit",
   "params": ......
  }},
 ...
]

dinboyko commented 1 month ago

OK, great. I am using the logging API with JSON as the storage format. I just need to figure out how to append log entries to the file without breaking the JSON structure. It turns out that opening the file in "a" mode (with open(file, "a")) and appending entries leaves the file human-readable but no longer parseable as JSON.
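One common way around the append problem is to write one JSON object per line (the JSON Lines convention) instead of a single JSON array, so appending never has to touch earlier content. The sketch below is an illustration only: `append_step` and `load_steps` are hypothetical helpers, the file name is arbitrary, and the code writes to the file directly rather than going through the logging API mentioned above.

```python
import json
import os
import tempfile

def append_step(path, step):
    """Append one step as a single JSON object on its own line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(step) + "\n")

def load_steps(path):
    """Parse the file back into a list, one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Demo in a temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "analysis_log.jsonl")
append_step(log_path, {"filter": {"class": "butterworth",
                                  "params": {"cutoff": 20000, "order": 4}}})
append_step(log_path, {"parser": {"class": "SpikeParser",
                                  "params": {"height": 100e-9, "distance": 0.02}}})
steps = load_steps(log_path)
```

Each line is valid JSON on its own, so the file stays readable in a text editor and can be parsed line by line. The trade-off is that the file as a whole is no longer a single JSON document.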

a-fa commented 1 month ago

Keep in mind that logging every action taken isn't necessarily the right record. People sometimes try a bunch of parameters for a step, so in addition to logging all actions, it's good to track the one path that leads to the final result and store that path along with the analysis results, whatever they may be. This is the tricky part.
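One possible way to keep both the full history and the single path to the result is to give every logged step a link to the step it was applied to, then walk those links back from the final step. This is a sketch under assumptions: the `StepLog` class is hypothetical, and the abandoned cutoff of 5000 is an invented value used only to show a discarded attempt.

```python
import itertools

class StepLog:
    """Records every step, each with a link to the step it was applied to."""
    def __init__(self):
        self._ids = itertools.count(1)
        self.steps = {}   # step id -> {"parent": id or None, "record": dict}

    def add(self, record, parent=None):
        """Log a step and return its id so later steps can point back to it."""
        step_id = next(self._ids)
        self.steps[step_id] = {"parent": parent, "record": record}
        return step_id

    def path_to(self, step_id):
        """Walk parent links back to the root; return records in execution order."""
        path = []
        while step_id is not None:
            entry = self.steps[step_id]
            path.append(entry["record"])
            step_id = entry["parent"]
        return path[::-1]

log = StepLog()
read = log.add({"io": {"class": "EDHReader"}})
# An attempt that was tried and then abandoned (illustrative value).
tried = log.add({"filter": {"class": "butterworth",
                            "params": {"cutoff": 5000, "order": 4}}}, parent=read)
kept = log.add({"filter": {"class": "butterworth",
                           "params": {"cutoff": 20000, "order": 4}}}, parent=read)
final = log.add({"parser": {"class": "SpikeParser",
                            "params": {"height": 100e-9, "distance": 0.02}}}, parent=kept)
final_path = log.path_to(final)   # reader, kept filter, parser
```

Here `log.steps` still holds all four actions, while `final_path` contains only the three that produced the result, so the abandoned filter does not appear in what gets stored with the analysis.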

dinboyko commented 2 weeks ago

TODO: start writing to a new file for each new analysis.
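A small sketch of one way to approach this TODO: derive a fresh file name for each analysis from a timestamp plus a short random suffix. The helper `new_log_path`, the `analysis` prefix, and the `.jsonl` extension are all hypothetical choices, not anything ioniq defines.

```python
import os
import tempfile
import uuid
from datetime import datetime, timezone

def new_log_path(directory, prefix="analysis"):
    """Build a log file path that is very unlikely to collide with an earlier one."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    unique = uuid.uuid4().hex[:8]   # separates analyses started in the same second
    return os.path.join(directory, f"{prefix}_{stamp}_{unique}.jsonl")

log_dir = tempfile.mkdtemp()
first = new_log_path(log_dir)    # path for one analysis
second = new_log_path(log_dir)   # path for the next analysis
```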