nasa / openmct

A web based mission control framework.
https://nasa.github.io/openmct/

Timeline CSV Export needs to be improved #751

Open Costigan opened 8 years ago

Costigan commented 8 years ago

The timeline export does give start & stop times and durations for all activities. That's good. But the timeline is a tree, and that tree has been shoehorned into a tabular format in a way that's awkward to use.

For instance, it does not directly give power usage for activities. The user would need to follow the activity mode links to find the relevant activities, sum the watts, then multiply by the duration.

Test-a-thon 3/15/16

VWoeltjen commented 8 years ago

I think the next logical increment would be to add columns for resource utilization (power/comms) and fill those in per-Activity. These would be direct costs associated with the Activity (sum of all related Activity Modes) multiplied by the duration of the Activity.

It would still take some work to go from that to a full resource graph like you see in Open MCT/WARP, but the additional steps there involve summing across Activities, which doesn't fit well into the one-object-per-row approach used currently.
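The per-Activity cost described above (sum of related Activity Mode power draws, multiplied by the Activity's duration) could be sketched roughly like this. The field names and data shape here are illustrative assumptions, not Open MCT's actual object model:

```python
# Hypothetical sketch of the per-Activity direct-cost calculation.
# "modes" and "watts" are assumed field names for illustration only.

def direct_cost(activity):
    """Sum the power draw (watts) of all Activity Modes linked to this
    activity, then multiply by its duration to get energy (watt-seconds)."""
    total_watts = sum(mode["watts"] for mode in activity["modes"])
    return total_watts * activity["duration_s"]

activity = {
    "name": "Downlink",
    "duration_s": 600,
    "modes": [{"watts": 5.0}, {"watts": 2.5}],
}
print(direct_cost(activity))  # 7.5 W * 600 s = 4500.0
```

The full resource graph would then still require summing these values across overlapping Activities, which is the part that doesn't fit the one-object-per-row layout.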

@RCarvalhoNASA - does this sound reasonable?

Costigan commented 8 years ago

I suggest we back up and discuss the requirement and use cases before patching the implementation.

Who are the users of the csv export? Scott? What software is used? Excel? Anything else?

I agree that adding columns for activity resource utilization makes sense.

charlesh88 commented 8 years ago

May need to hear from Robert whether this is a blocker issue for Gibson.

RCarvalhoNASA commented 8 years ago

Not a blocker for any build. Would be good to have if we cannot transition the data I'm building in the persistent version to a newer version, but it's negotiable. @VWoeltjen , your approach for handling this seems perfectly reasonable.

akhenry commented 8 years ago

Just wanted to add comments from @shobart on another ticket:

Also, it seems to automatically download to the Download directory. Would be good to give the user the option to select the location of where they want to download it.

Costigan commented 8 years ago

Here are some thoughts about requirements.

I think these should be implemented via a pretty-printed json export.

The feature that satisfies this doesn't need to be limited to plans, but it can't involve exporting the entire database.

I don't know whether the third requirement should mean that object identity must be preserved over export/import or only that the network of objects imported be isomorphic to the original object network. I think guaranteeing preservation of object identity is too hard, given what might happen to the rest of the database between export and import. I don't think that the identity of individual activities has to be preserved, but I do think that other objects referenced by activities, e.g., subsystems or telemetry points, should be. So we probably should talk. Certainly the internal structure of the plan must be preserved.

This requirement explicitly states that the csv files are for running the numbers on resource usage. The csv export does not have to preserve all of the structure of the plan unless that's needed for the kind of analysis mentioned. Any external tools that need to know everything about the plan should use the json export. This lets you design the csv export to simplify use in tools like excel.

Actually, to put a fine point on it, I would like us to design this export so that it can be used easily in excel or google doc's equivalent. Not just that it's possible, but that it can be used easily. I'm struggling with that. I'm completely open to suggestions on how to achieve that. The problem is the activity hierarchy.

Ideally, the csv would have a fixed set of columns that are always present. For a timeline-oriented tool like LASS, it's this way. There is no activity hierarchy. (Hierarchy is implemented by a separate concept called an Activity Group, and they can often be ignored.) Each activity exists on exactly one timeline. And there's a set of resource types that are associated with all activities. The set of resource types is determined by the activity dictionary which is usually fixed per mission. So it's not unreasonable to slightly tune the export code for that fixed set of resource types. 'Timeline' becomes a column you can filter on, like 'Name' and 'Duration'. The resources become a fixed set of columns.

Resources in warp plans can be handled similarly. It's the activity hierarchy that I'm unclear on. If we were exporting an excel file rather than csv, then the natural thing would be to emit rows for all leaf activities with numeric values in the resource columns, and rows for non-leaf activities with equations in the resource columns that sum the resources of their children. We can do the same thing with csv, just with the sums rather than the equations, but the user has to be careful not to double count things.

Here's an option that isn't perfect. First, lose the columns that encode the tree in the current implementation. Also, drop the column with the cryptic unique IDs.

Then, always export the tree in parent-first, left-to-right order. Add a column that indicates the depth of the node at that row. Finally, add a column that indicates whether the node is a leaf node. For many analyses, just filtering to the leaf nodes and then summing some resource usage would be fine. For productivity, you'd filter to specific, known activity names and then sum durations. With the two added columns, one could create a computed column that flags nodes that are at or above some depth, which could be used for the rest. And the tree is still fully encoded, but with a fixed set of columns (as long as the resource columns are fixed).
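The scheme above could be sketched as follows. The node structure and resource fields are assumptions for illustration, not the actual Open MCT timeline model; the point is the parent-first order, the Depth and Leaf columns, and the leaf-only sum that avoids double counting:

```python
# Illustrative sketch: export an activity tree parent-first,
# left-to-right, with Depth and Leaf columns. Field names are assumed.
import csv
import io

def export_rows(node, depth=0):
    """Yield one row per node in preorder (parent before children)."""
    children = node.get("children", [])
    yield {
        "Name": node["name"],
        "Duration": node["duration"],
        "Power (W)": node.get("watts", 0),
        "Depth": depth,
        "Leaf": not children,
    }
    for child in children:
        yield from export_rows(child, depth + 1)

plan = {
    "name": "Plan", "duration": 100, "children": [
        {"name": "A", "duration": 60, "watts": 3},
        {"name": "B", "duration": 40, "children": [
            {"name": "B1", "duration": 40, "watts": 1},
        ]},
    ],
}

rows = list(export_rows(plan))

# Summing a resource over leaf rows only avoids double counting
# parents whose values roll up their children.
leaf_energy = sum(r["Duration"] * r["Power (W)"] for r in rows if r["Leaf"])

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

In a spreadsheet, filtering on `Leaf` gives the leaf-only analyses directly, and a computed column over `Depth` covers the at-or-above-some-depth cases, while the full tree remains recoverable from the row order plus the depth values.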