christopherwharrop / rocoto

Rocoto Workflow Management System
Apache License 2.0
rocotostat/rocotocheck xml output #21

Open samtrahan opened 6 years ago

samtrahan commented 6 years ago

Rocoto's state-viewing commands (rocotostat and rocotocheck) have no option to generate their output in XML format, which downstream tools could parse easily.

christopherwharrop commented 6 years ago

Adding a -x flag to the rocotostat and rocotocheck commands to request output in XML format is a good idea.

christopherwharrop commented 6 years ago

Let's discuss the XML structure to make sure it can be extended later without changing existing behavior. Otherwise, downstream tools that depend on it will break when we change it.
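
One way to picture "extensible without breaking consumers": a reader that looks up only the elements it knows by name keeps working when new elements are added next to them. The sketch below is illustrative only. The element names (`status`, `task`, `state`, `priority`) and both sample strings are hypothetical, since no format has been agreed on at this point in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical output from an older and a newer release. The newer one
# adds a <priority> element that the older one did not have.
OLD_OUTPUT = '<status><task name="a"><state>RUNNING</state></task></status>'
NEW_OUTPUT = (
    '<status><task name="a"><state>RUNNING</state>'
    '<priority>5</priority></task></status>'
)


def task_states(xml_text):
    """Map task name -> state, ignoring any element this reader does not know."""
    root = ET.fromstring(xml_text)
    return {task.get("name"): task.findtext("state") for task in root.iter("task")}


# Both versions yield the same result for a reader that only asks for <state>.
print(task_states(OLD_OUTPUT))
print(task_states(NEW_OUTPUT))
```

Renaming or removing an element such as `<state>` would break this reader, so additive changes are the safe kind.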

samtrahan commented 6 years ago

Chris,

I'll send you some ideas shortly. The goal is to make an XML form of the tables that rocotostat dumps.

As a last resort, in the future, if we need to make backward-incompatible changes to the XML, we could specify a version.

rocotostat ... --xml-version 1.3 # get Rocoto 1.3's XML output style

samtrahan commented 6 years ago

The main use case is for an automated tool to either display, or act on, the status of a workflow. The expected structure is the same as the workflow document. However, this will only be parsed by software, not humans, so the syntax should be expressive rather than compact. I suggest we put in every piece of information that is readily available, but not likely to change significantly from version to version:

<workflow_status>
  <metatask>
    <task name="my_metatask_A">
      <state>RUNNING</state>
      <join>/some/path/to/a/file</join>
      <job>
        <state>FAILED</state>
        <submitted_at>1532961111</submitted_at>
        <native_state>EXIT</native_state>
        <exit_status>13</exit_status>
        <runtime>1803</runtime>
        ...
      </job>
      <job>
        <state>RUNNING</state>
        <submitted_at>1532963333</submitted_at>
        <native_state>ACTIVE</native_state>
        ...
      </job>
    </task>
    <task name="my_metatask_B">
      ...
    </task>
    <task name="my_metatask_C">
      ...
    </task>
    <task name="my_metatask_D">
      ...
    </task>
  </metatask>
  <task name="my_final_task" final="T">
    <state>INACTIVE</state>
    <join>/some/path/to/a/file</join>
    ...
  </task>
</workflow_status>
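
To show how an automated tool might consume this, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It assumes the layout proposed above, with the "..." placeholders left out; the `summarize` function name and the trimmed sample are illustrative, not part of Rocoto.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the proposed layout; the "..." placeholders are omitted.
SAMPLE = """\
<workflow_status>
  <metatask>
    <task name="my_metatask_A">
      <state>RUNNING</state>
      <job>
        <state>FAILED</state>
        <exit_status>13</exit_status>
      </job>
      <job>
        <state>RUNNING</state>
      </job>
    </task>
  </metatask>
  <task name="my_final_task" final="T">
    <state>INACTIVE</state>
  </task>
</workflow_status>
"""


def summarize(xml_text):
    """Return (task name, task state, list of job states) for every task."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() walks the whole tree, so tasks nested in <metatask> are included.
    for task in root.iter("task"):
        # findall("job") and findtext("state") look at direct children only,
        # so a job's <state> is never mistaken for the task's <state>.
        job_states = [job.findtext("state") for job in task.findall("job")]
        rows.append((task.get("name"), task.findtext("state"), job_states))
    return rows


for name, state, jobs in summarize(SAMPLE):
    print(name, state, jobs)
```

Because the reader asks for elements by name, any extra elements added to `<task>` or `<job>` later are ignored.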

christopherwharrop commented 6 years ago

That looks like a reasonable proposal to start from.

samtrahan commented 6 years ago

I'm also considering making it a new program, since rocotostat is complex enough as it is. I was thinking of something like "rocotodump." To a new developer coming into the system from outside, that name would suggest a program that dumps the state in a non-human-readable form.