tarsqi / ttk

Tarsqi Toolkit
Apache License 2.0

Add processing history to metadata #14

Closed marcverhagen closed 3 years ago

marcverhagen commented 8 years ago

With the processing history added it would be possible to look at a file and see how it came into being. This should probably be a list inside the metadata element of the ttk file, storing for each step the name of the component, the date, and perhaps other information like the version of the software or a git commit.

<metadata>
   <dct value="20151231"/>
   <processing_steps>
      <step component="PREPROCESSOR" date="20151231" git_commit="fp327gh"/>
      <step component="GUTIME,EVITA" date="20160101" git_commit="fp327gh"/>
   </processing_steps>
</metadata>

This history shows that on the last day of 2015 we ran the preprocessor and the next day we ran a pipeline with GUTime and Evita, in both cases using the same code base. Other attributes could be added as needed. It might be useful to split the second step into two separate steps, one per component, and to use a timestamp instead of a date:

<metadata>
   <dct value="20151231"/>
   <processing_steps>
      <step component="PREPROCESSOR" date="20151231" git_commit="fp327gh"/>
      <step component="GUTIME" date="20160101:171203" git_commit="fp327gh"/>
      <step component="EVITA" date="20160101:171205" git_commit="fp327gh"/>
   </processing_steps>
</metadata>
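
A minimal sketch of how a component could record itself in this format, assuming the metadata is held as an `xml.etree.ElementTree` element. The helper name `add_step` and the `YYYYMMDD:HHMMSS` timestamp layout are illustrative assumptions taken from the example above, not existing TTK code.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def add_step(metadata, component, git_commit, when=None):
    """Append a <step> to metadata/processing_steps (hypothetical helper).

    Creates the <processing_steps> list if the document does not have one yet.
    """
    steps = metadata.find("processing_steps")
    if steps is None:
        steps = ET.SubElement(metadata, "processing_steps")
    when = when or datetime.now()
    ET.SubElement(steps, "step", {
        "component": component,
        # Timestamp layout assumed from the example: date, colon, time.
        "date": when.strftime("%Y%m%d:%H%M%S"),
        "git_commit": git_commit,
    })
    return steps


# Start from metadata that only has a document creation time.
metadata = ET.fromstring('<metadata><dct value="20151231"/></metadata>')
add_step(metadata, "GUTIME", "fp327gh", datetime(2016, 1, 1, 17, 12, 3))
add_step(metadata, "EVITA", "fp327gh", datetime(2016, 1, 1, 17, 12, 5))
print(ET.tostring(metadata, encoding="unicode"))
```

Recording one step per component, as here, keeps each entry independent, so a later reader does not need to parse a comma-separated component list.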

This may relate to issue https://github.com/tarsqi/ttk/issues/3 on adding views. You could imagine each tag having a step attribute which contains the component that added the tag.

marcverhagen commented 7 years ago

This will also promote good hygiene: with the history in place it is easy to check whether a component has already been applied. That check is useful because letting a component apply twice is error-prone and gives head-scratching results until you realize where you went wrong.
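
The check could look roughly like this, assuming the history is stored as `<step>` elements in the format proposed earlier in this issue. The function name `already_applied` is hypothetical, and `SLINKET` is used only as an example of a component that has not run.

```python
import xml.etree.ElementTree as ET


def already_applied(metadata, component):
    """Return True if any recorded step names the component (hypothetical helper)."""
    for step in metadata.iter("step"):
        # A step may list several components, as in component="GUTIME,EVITA".
        names = [name.strip() for name in step.get("component", "").split(",")]
        if component in names:
            return True
    return False


metadata = ET.fromstring("""
<metadata>
   <dct value="20151231"/>
   <processing_steps>
      <step component="PREPROCESSOR" date="20151231" git_commit="fp327gh"/>
      <step component="GUTIME,EVITA" date="20160101" git_commit="fp327gh"/>
   </processing_steps>
</metadata>
""")

if already_applied(metadata, "EVITA"):
    print("EVITA has already run on this document, skipping")
if not already_applied(metadata, "SLINKET"):
    print("SLINKET has not run yet")
```

A pipeline could call such a check before each component and either skip the component or raise an error, depending on how strict it should be.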

With processing history added to the metadata dictionary, it probably makes sense to replace the dictionary with a Metadata class.
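
A rough sketch of what such a class might look like. Everything here (the attribute names, `add_step`, `has_applied`) is an assumption for illustration and not the actual TTK implementation.

```python
class Metadata(object):
    """Hypothetical replacement for the metadata dictionary."""

    def __init__(self, dct=None):
        self.dct = dct                 # document creation time, e.g. "20151231"
        self.processing_steps = []     # one dict per step, oldest first

    def add_step(self, component, date, git_commit=None):
        """Record that a component ran on the document."""
        self.processing_steps.append(
            {"component": component, "date": date, "git_commit": git_commit})

    def has_applied(self, component):
        """Return True if the component appears in the processing history."""
        return any(step["component"] == component
                   for step in self.processing_steps)


metadata = Metadata(dct="20151231")
metadata.add_step("PREPROCESSOR", "20151231", git_commit="fp327gh")
metadata.add_step("GUTIME", "20160101:171203", git_commit="fp327gh")
print(metadata.has_applied("GUTIME"))
print(metadata.has_applied("EVITA"))
```

Compared with a plain dictionary, a class gives the history one place to live and lets the "has this component already run" check sit next to the data it inspects.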