For every run, there are data files, in addition to the input egg files, that contain system state information. It would be nice to read those in, give access to them to any processors that care, and also write them to the output trees so that all the data are in one file for downstream analysis.
Deliverables
Notes
The snapshot file has the most recent value of every database "endpoint" as of the start of the run, along with the timestamp of that entry. The dump file has the timestamp and value of each sensor reading that happens to be logged during the run. The suggested output is a tree with a timestamp branch and one branch per endpoint. Entry 0 should hold the initial values, with the start time of the run as its timestamp (warning to users: this timestamp is not the assignment/measurement time). For each value logged after that, a new entry should be added that carries the new value in the single changed branch and repeats the previous value in every other branch. This repeats some data, but it is structurally simpler both to construct and to use (if you interpolate between two times with the same value, you clearly get that value). Our starting assumption is that the number of endpoints and the number of logged values over the course of a run are small enough that the data inefficiency is not a large concern, at least at first pass.
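To make the proposed entry layout concrete, here is a minimal Python sketch of the forward-fill logic described above. It is illustrative only: the function name `build_entries`, the endpoint names, and the in-memory shapes of the snapshot (a dict) and the dump (a list of `(timestamp, endpoint, value)` tuples) are assumptions, since the actual file formats are not specified here, and the real implementation would fill tree branches rather than build a list of dicts.

```python
from typing import Dict, List, Tuple


def build_entries(
    run_start: float,
    snapshot: Dict[str, float],
    dump: List[Tuple[float, str, float]],
) -> List[Dict[str, float]]:
    """Build one row per tree entry: a timestamp plus one value per endpoint.

    Hypothetical inputs: `snapshot` maps endpoint name -> value at run start,
    `dump` holds (timestamp, endpoint, value) records logged during the run.
    """
    current = dict(snapshot)  # running state, one value per endpoint
    # Entry 0: initial values, stamped with the run start time
    # (not the time the values were actually measured).
    entries = [{"timestamp": run_start, **current}]
    for timestamp, endpoint, value in sorted(dump, key=lambda rec: rec[0]):
        if endpoint not in current:
            # Assumption: the snapshot lists every endpoint, so the set of
            # branches is fixed before the first entry is written.
            raise KeyError(f"endpoint {endpoint!r} missing from snapshot")
        current[endpoint] = value  # only this branch changes
        entries.append({"timestamp": timestamp, **current})
    return entries


# Example with made-up endpoint names and values.
snapshot = {"temperature_K": 4.2, "pressure_mbar": 1.0e-7}
dump = [
    (1005.0, "temperature_K", 4.3),
    (1010.0, "pressure_mbar", 2.0e-7),
]
entries = build_entries(1000.0, snapshot, dump)
```

Each row repeats the unchanged endpoints, so the number of stored values grows as (number of endpoints) × (number of logged values + 1), which is the inefficiency accepted above.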