Tree component must correctly render families

kinow commented 5 years ago

@hjoliver reported on riot families are not properly displayed with the current Tree Component. Adding as a tentative for the alpha release.

kinow commented 5 years ago

Quickly hacked five to have a BAR family with two children bar1 and bar2:

[meta]
    title = "Inter-cycle dependence + a cold-start task"
[cylc]
    UTC mode = True
[scheduling]
    #runahead limit = 120
    initial cycle point = 20130808T00
    final cycle point = 20150810T00
    [[dependencies]]
        [[[R1]]]
            graph = "prep => foo"
        [[[PT12H]]]
            graph = "foo[-PT12H] => foo => BAR"

[runtime]
[[root]]
script = "sleep 2; echo Hi!"

[[BAR]]
script = "sleep 2; echo Family!"

[[bar1]]
inherit = "BAR"

[[bar2]]
inherit = "BAR"

Here's the result:

The children are displayed. The family is never displayed in the tree.

kinow commented 5 years ago

Here's the GraphQL query executed:

{
  workflows(ids: ["five"]) {
    id
    name
    status
    owner
    host
    port
    taskProxies {
      id
      state
      cyclePoint
      task {
        meanElapsedTime
        name
      }
      jobs {
        id
        host
        startedTime
        state
        submitNum
      }
    }
  }
}

And the response:

{
  "data": {
    "workflows": [
      {
        "id": "kinow|five",
        "name": "five",
        "status": "running",
        "owner": "kinow",
        "host": "ranma",
        "port": 43060,
        "taskProxies": [
          {
            "id": "kinow|five|20130829T0000Z|bar2",
            "state": "succeeded",
            "cyclePoint": "20130829T0000Z",
            "task": {
              "meanElapsedTime": 1.899999976158142,
              "name": "bar2"
            },
            "jobs": [
              {
                "id": "kinow|five|20130829T0000Z|bar2|1",
                "host": "localhost",
                "startedTime": "2019-09-16T05:51:23Z",
                "state": "succeeded",
                "submitNum": 1
              }
            ]
          },
          {
            "id": "kinow|five|20130829T1200Z|bar2",
            "state": "submitted",
            "cyclePoint": "20130829T1200Z",
            "task": {
              "meanElapsedTime": 1.899999976158142,
              "name": "bar2"
            },
            "jobs": [
              {
                "id": "kinow|five|20130829T1200Z|bar2|1",
                "host": "localhost",
                "startedTime": "",
                "state": "submitted",
                "submitNum": 1
              }
            ]
          },
          {
            "id": "kinow|five|20130829T0000Z|bar1",
            "state": "succeeded",
            "cyclePoint": "20130829T0000Z",
            "task": {
              "meanElapsedTime": 1.899999976158142,
              "name": "bar1"
            },
            "jobs": [
              {
                "id": "kinow|five|20130829T0000Z|bar1|1",
                "host": "localhost",
                "startedTime": "2019-09-16T05:51:23Z",
                "state": "succeeded",
                "submitNum": 1
              }
            ]
          },
          {
            "id": "kinow|five|20130829T1200Z|foo",
            "state": "succeeded",
            "cyclePoint": "20130829T1200Z",
            "task": {
              "meanElapsedTime": 1.899999976158142,
              "name": "foo"
            },
            "jobs": [
              {
                "id": "kinow|five|20130829T1200Z|foo|1",
                "host": "localhost",
                "startedTime": "2019-09-16T05:51:23Z",
                "state": "succeeded",
                "submitNum": 1
              }
            ]
          },
          {
            "id": "kinow|five|20130829T0000Z|foo",
            "state": "succeeded",
            "cyclePoint": "20130829T0000Z",
            "task": {
              "meanElapsedTime": 1.899999976158142,
              "name": "foo"
            },
            "jobs": [
              {
                "id": "kinow|five|20130829T0000Z|foo|1",
                "host": "localhost",
                "startedTime": "2019-09-16T05:51:18Z",
                "state": "succeeded",
                "submitNum": 1
              }
            ]
          },
          {
            "id": "kinow|five|20130830T0000Z|foo",
            "state": "submitted",
            "cyclePoint": "20130830T0000Z",
            "task": {
              "meanElapsedTime": 1.899999976158142,
              "name": "foo"
            },
            "jobs": [
              {
                "id": "kinow|five|20130830T0000Z|foo|1",
                "host": "localhost",
                "startedTime": "",
                "state": "submitted",
                "submitNum": 1
              }
            ]
          },
          {
            "id": "kinow|five|20130829T1200Z|bar1",
            "state": "submitted",
            "cyclePoint": "20130829T1200Z",
            "task": {
              "meanElapsedTime": 1.899999976158142,
              "name": "bar1"
            },
            "jobs": [
              {
                "id": "kinow|five|20130829T1200Z|bar1|1",
                "host": "localhost",
                "startedTime": "",
                "state": "submitted",
                "submitNum": 1
              }
            ]
          }
        ]
      }
    ]
  }
}

So we do not have families in the response yet. My guess is that we will have to change the query, to something like:

{
  workflows(ids: ["five"]) {
    id
    name
    status
    owner
    host
    port
    familyProxies {
      id
      name
      state
      childTasks {
        id
        state
        cyclePoint
        task {
          meanElapsedTime
          name
        }
        jobs {
          id
          host
          startedTime
          state
          submitNum
        }
      }
    }
  }
}

But probably need to confirm with @dwsutherland if this query makes sense (e.g. I am excluding familyProxies / family because I only need the name for the tree, and it seems like the proxy/family share the same name?). Or if there is a better alternative to retrieve workflow + family + task + job information.

hjoliver commented 5 years ago

@kinow - I'm pretty sure @dwsutherland said we have the flexibility to get a nested tree back via GraphQL by just structuring the query the right way, although I couldn't pull that query out of the hat myself yet.

However, that conflicts with our shared data store concept: flat list of nodes and edges, with nodes used by all views and nodes+edges used by the graph view. If we do that we'll have to compute the family tree structure in the UI, for the tree view, by using the abstract tree info.

I guess the good news is GraphQL gives us the power to go either way, but the bad news (for you at the moment :-) is we decided to go with common data store for optimized network transfer (and to avoid cross-view duplication) at least as a first cut.

Showing the tree structure isn't a blocker for alpha-1 relase IMO, unless it's easy (after getting the leaf data in first).

@oliver-sanders - does this conform with your understanding?

kinow commented 5 years ago

Showing the tree structure isn't a blocker for alpha-1 relase IMO, unless it's easy (after getting the leaf data in first).

Much better that way. And work on the leaf data starting tomorrow morning! :steam_locomotive:

dwsutherland commented 5 years ago

@kinow - I'm pretty sure @dwsutherland said we have the flexibility to get a nested tree back via GraphQL by just structuring the query the right way, although I couldn't pull that query out of the hat myself yet.

Was something like this:

fragment treeNest on FamilyProxy {
  name
  cyclePoint
  state
  depth
  childTasks(ids: $nIds, states: $nStates, mindepth: $minDepth, maxdepth: $maxDepth) {
    id
    state
    latestMessage
    depth
    jobs {
      id
      host
      batchSysName
      batchSysJobId
      submittedTime
      startedTime
      finishedTime
      submitNum
    }
  }
}

query tree($wIds: [ID], $nIds: [ID], $nStates: [String], $minDepth: Int, $maxDepth: Int) {
  workflows(ids: $wIds) {
    id
    name
    status
    stateTotals
    treeDepth
  }
  familyProxies(workflows: $wIds, ids: ["root"]) {
    ...treeNest
    childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
      ...treeNest
      childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
        ...treeNest
        childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
          ...treeNest
          childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
            ...treeNest
          }
        }
      }
    }
  }
}

{
  "wIds": ["baz"],
  "nIds": ["20*|*"],
  "nStates": ["succeeded", "waiting", "held"],
  "minDepth": 0,
  "maxDepth": 4 
}

However, that conflicts with our shared data store concept: flat list of nodes and edges, with nodes used by all views and nodes+edges used by the graph view. If we do that we'll have to compute the family tree structure in the UI, for the tree view, by using the abstract tree info.

I think we may need to revisit this one... Because we have to consider:

Pagination of queried data (for infinite scroll), wouldn't work across views (as the standard way of doing this is via the query/server-side/data-driven)... Although it may not be suitable for data as dynamic as ours.
Different views have different requirements on the fields and structure of the data

Having said that.. you could change the above inheritance structure to just return the IDs, and then fill it in from the flat list of family and task nodes:

fragment treeNest on FamilyProxy {
  id
  childTasks(ids: $nIds, states: $nStates, mindepth: $minDepth, maxdepth: $maxDepth) {
    id
    jobs {
      id
    }
  }
}

query tree($wIds: [ID], $nIds: [ID], $nStates: [String], $minDepth: Int, $maxDepth: Int) {
  workflows(ids: $wIds) {
    id
  }
  familyProxies(workflows: $wIds, ids: ["root"]) {
    ...treeNest
    childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
      ...treeNest
      childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
        ...treeNest
        childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
          ...treeNest
          childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
            ...treeNest
          }
        }
      }
    }
  }
}

{
  "wIds": ["baz"],
  "nIds": ["20*|*"],
  "nStates": ["succeeded", "waiting", "held"],
  "minDepth": 0,
  "maxDepth": 4 
}

kinow commented 5 years ago

Thanks @dwsutherland ! Looks like it may be doable, but can't guarantee it would be doable for the alpha release (cc @hjoliver, really really unlikely to include this one I think).

kinow commented 5 years ago

Moved to 0.2

hjoliver commented 5 years ago

@dwsutherland

However, that conflicts with our shared data store concept: flat list of nodes and edges, with nodes used by all views and nodes+edges used by the graph view. If we do that we'll have to compute the family tree structure in the UI, for the tree view, by using the abstract tree info.

I think we may need to revisit this one... Because we have to consider: [pagination and different view requirements...]

Agreed, we need to put some thought into this to figure out which way to go.

kinow commented 5 years ago

My task for the next week. Will try first to fix the issue to support variables in queries. Then work on this one :+1:

hjoliver commented 5 years ago

@oliver-sanders - can you comment on the UI data-store implications of task family tree structure, as raised above? Do we request nested data and duplicate the data (if non-nested views are also active) or - as planned - share flat data and compute the tree in the UI?

oliver-sanders commented 5 years ago

can you comment on the UI data-store implications of task family tree structure, as raised above

I think this undermines the central data store pretty badly. There would still be a large amount of overlap between subscriptions (as each view wants pretty much the same data) but it would be very hard to compute the overlap and merge the subscriptions accordingly. Consequences:

Lower compute load on the UI (as the data arrives in the desired structure reducing lookups).
Higher memory footprint for the UI (due to storing duplicate data in different structures).
Higher load on the UI Server (due to serving multiple overlapping subscriptions).
Greater network usage (due to pushing more data down the pipe).

There's a tradeoff here, I originally envisaged a flat data store as this makes memory and network nice and efficient, keeping the load away from the workflow and UI Server. The tradeoff is that the UI has to do a bit of cross-referencing to get things to work. My logic was that the memory and network arguments were much more important and that whilst cross-referencing between different flat structures might be sub-optimal, it only has to be done on a case by case basis when data actually changes (the joy of a responsive, data-driven system). So maybe the first load takes a little bit longer, but from then on in, the performance cost aught to be minimal.

So personally I would much rather have a central data store with everything in a simple standard data structure with all of the views providing hooks to respond to data structure changes. This is more of a "watch" pattern which doesn't seem to fit perfectly with Vue (maybe we need to look into observables or whatever they're called).

tldr;

If looking up the family structure for each task is proving too computationally expensive, could we not just request the list of linearised ancestors for each task? This information is known by the workflow and should be cheep to pass through.

hjoliver commented 5 years ago

@kinow - Oliver has explained the situation nicely there.

We don't have any evidence yet that the extra UI computation (computing the family tree client-side) will be a problem, and we can always change tack later if it turns out that it is a problem (but I'm skeptical that it will be bad compared to other client-side computation that we can't avoid anyway).

So on that basis we will stick to the original plan: the UI should request flat data for easy sharing between views (and minimal memory footprint) and compute the family tree client-side.

@MartinRyan - this will affect the graph view too in some sense. Even though you don't want the dependency data in tree form, we need family collapse in the graph. (And that has to be computed client-side, unlike for Cylc-7).

The old UI computes the family tree from:

flat list of concrete task proxies (concrete: for specific cycle points)
abstract family tree structure (abstract: based on family and task names only).

In the new UI we can either:

do the same as for the old UI
OR (as Oliver suggest above I think) each task could hold its own list of linearized ancestors

The new way (OR) might be easier...

kinow commented 5 years ago

We don't have any evidence yet that the extra UI computation (computing the family tree client-side) will be a problem

https://github.com/cylc/cylc-ui/issues/227#issuecomment-534300509

The scripting part is a bit worrying, but at this moment I am chasing memory leaks.

But my impression is that for the complex suite, with polling (! might get better when websockets/data updates is used), the UI performance downgrades significantly because of the re-computing of data, and consequent update re-render of DOM elements. This, combined with several SVG elements being rendered too.

The memory leaks are hindering the performance too. But once they are fixed, until we have websockets working with data-driven updates, users might struggle to run complex workflow with the tree component.

we can always change tack later if it turns out that it is a problem (but I'm skeptical that it will be bad compared to other client-side computation that we can't avoid anyway).

👍

So on that basis we will stick to the original plan: the UI should request flat data for easy sharing between views (and minimal memory footprint) and compute the family tree client-side.

Alright. The query that I was going to use was based on the example with fragments and nested levels provided by @dwsutherland a few comments above ☝️ . Does that mean that actually we will have another query that returns a single level with all information about tasks / families / jobs / etc ? With no hierarchy?

kinow commented 5 years ago

(obviously, the performance problems above could all be due to bugs in the tree component too, or some bad design in the tree component 😬 )

hjoliver commented 5 years ago

@kinow - I assumed your "complex suite" + treeview problem is a memory leak (as I think you suggested?) as it grew over time.

But once they are fixed, until we have websockets working with data-driven updates

I'm not worried about that - we will definitely have websockets and data-driven updates before anyone needs to run real suites.

kinow commented 5 years ago

Not only memory leaks. Even before we had the tree component, when we were using the vue-ads-table-tree component, it was already impossible to run the complex suite.

The five suite worked fine for me, but during the development of the tree component I wanted to make sure I didn't miss any possible bug, and decided to look at performance with the complex suite.

That showed that we had memory leaks, and that the complex suite was still not usable with the cylc web UI.

hjoliver commented 5 years ago

the UI performance downgrades significantly because of the re-computing of data, and consequent update re-render of DOM elements.

I don't have a good feel for what gets recomputed and what gets rerendered in the new UI, and what we can expect peformance-wise compared to native Python + PyGTK.

Once we have all the tree view basics in place, we can start looking at performance.

Note that the proposed "n-distance" changes will help by greatly reducing the number of waiting and succeeded tasks that the UI has to display (relative the the complete task pool as shown now).

kinow commented 5 years ago

Once we have all the tree view basics in place, we can start looking at performance.

👍

Note that the proposed "n-distance" changes will help by greatly reducing the number of waiting and succeeded tasks that the UI has to display (relative the the complete task pool as shown now).

👍

Updating the tree view and component shouldn't be too hard if necessary. I just need to understand what will change and where, and see where I can help (happy to dive into the GraphQL backend again if necessary)

hjoliver commented 5 years ago

Not only memory leaks ...

The way I see it, we've only just reached the point we have all components of the new architecture working together, and I'm not expecting great performance yet. Having got to this point though, particularly once the tree view has all basic functionality, we do need to start thinking about performance.

dwsutherland commented 5 years ago

I don't see why we can't do both...

The nested one giving the structure (only ids returned) with query/subscription (which only updates when a cycle of ids is added or removed)
A flat list of nodes and flat list of edges (with all fields required for all view)

You can't dribble in updates very easily with only a nested structure (because you'd need to send the structure with it each time, since the UI would need to know where in the graph/tree structure to update).. But a flat list makes it easy to update just the nodes..

The data-driven (nested only) approach would be efficient on the UI if your view is limited, you only need a fraction of the full data set, which means updates and the amount of data held by the UI is small..

And yes, the task and family proxies have parent and children fields..

hjoliver commented 5 years ago

I don't see why we can't do both...

The nested one giving the structure (only ids returned) with query/subscription (which only updates when a cycle of ids is added or removed)

That's a good point. Somewhat analogous to the graph edges (structural information, without all the data associated with each node). For a huge suite it might still be a signficant memory hit though, in which case it may still be better to avoid it if the tree computation can be done client-side very easily and quickly. Worth thinking about though - ping @oliver-sanders

hjoliver commented 5 years ago

And yes, the task and family proxies have parent and children fields..

Is that the full list (back to root), not just immediate parents and children?

dwsutherland commented 5 years ago

And yes, the task and family proxies have parent and children fields..

Is that the full list (back to root), not just immediate parents and children?

I'll just check... other question is whether it's in order (if a flat list)

hjoliver commented 5 years ago

Even before we had the tree component, when we were using the vue-ads-table-tree component, it was already impossible to run the complex suite.

(for the record, via Riot chat, the current treeview handles the complex suite pretty well so long as you don't allow too many jobs to run at once on the workflow host)

dwsutherland commented 5 years ago

Given an inheritance:

[runtime]
    [[root]]
        init-script = echo "I'm first"
        env-script = echo "Hi first, I'm second"
        script = sleep 5; echo "RubyRubyRubyRuby"
        exit-script = echo 'Yay!'
        err-script = echo 'Boo!'
    [[FAM]]
    [[FAM2]]
        inherit = FAM
    [[FAM3]]
        inherit = FAM2
    [[foo]]
        inherit = FAM3
    [[bar]]
        inherit = FAM3
    [[FAM4]]
    [[FAM5]]
    [[FAM6]]
        inherit = FAM5
    [[qux]]
        inherit = FAM6
    [[qaz]]
        inherit = FAM4, FAM
    [[qar]]
        inherit = FAM, FAM4, FAM5

You can gather the inheritance information via the fields:

query {
  tasks(workflows: ["baz"], ids: ["foo"]) {
    id
    namespace
    depth
  }
  taskProxies(workflows: ["baz"], ids: ["foo", "qar"]) {
    id
    parents (sort: {keys: ["depth", "id"]}) {
      id
    }
    firstParent {
      id
    }
    namespace
    depth
  }
  families(workflows: ["baz"], ids: ["FAM2", "FAM3", "FAM5"], sort: {keys: ["id"]}) {
    id
        childTasks {
          id
        }
    childFamilies {
      id
    }
    parents (sort: {keys: ["depth", "id"]}){
      id
    }
    depth
  }
  familyProxies(workflows: ["baz"], ids: ["FAM2", "FAM3", "FAM5"], sort: {keys: ["id"]}) {
    id
    parents (sort: {keys: ["depth", "id"]}) {
      id
    }
    firstParent {
      id
    }
    childTasks {
      id
    }
    childFamilies {
      id
    }
    depth
  }
}

Although, first parent might be enough:

query {
  taskProxies(workflows: ["baz"], ids: ["foo", "qar"]) {
    id
    firstParent {
      id
    }
  }
  familyProxies(workflows: ["baz"], ids: ["FAM2", "FAM3", "FAM5"], sort: {keys: ["id"]}) {
    id
    firstParent {
      id
    }
  }
}

{
  "data": {
    "taskProxies": [
      {
        "id": "sutherlander|baz|20170101T0000+13|foo",
        "firstParent": {
          "id": "sutherlander|baz|20170101T0000+13|FAM3"
        }
      },
      {
        "id": "sutherlander|baz|20170101T0000+13|qar",
        "firstParent": {
          "id": "sutherlander|baz|20170101T0000+13|FAM"
        }
      }
    ],
    "familyProxies": [
      {
        "id": "sutherlander|baz|20170101T0000+13|FAM2",
        "firstParent": {
          "id": "sutherlander|baz|20170101T0000+13|FAM"
        }
      },
      {
        "id": "sutherlander|baz|20170101T0000+13|FAM3",
        "firstParent": {
          "id": "sutherlander|baz|20170101T0000+13|FAM2"
        }
      },
      {
        "id": "sutherlander|baz|20170101T0000+13|FAM5",
        "firstParent": {
          "id": "sutherlander|baz|20170101T0000+13|root"
        }
      }
    ]
  }
}

kinow commented 5 years ago

I will move the discussion of the tree family elsewhere for now, probably Riot, and reply here later with what was decided there.

kinow commented 5 years ago

Thanks @dwsutherland & @hjoliver for the patience on Riot.

The current query used by the Tree.vue view, given to the LiveWorkflowService to query/poll data is:

{
  workflows(ids: ["WORKFLOW_ID"]) {
    id
    name
    status
    owner
    host
    port
    taskProxies(sort: { keys: ["cyclePoint"] }) {
      id
      state
      cyclePoint
      latestMessage
      task {
        meanElapsedTime
        name
      }
      jobs(sort: { keys: ["submit_num"], reverse:true }) {
        id
        batchSysName
        batchSysJobId
        host
        startedTime
        submittedTime
        finishedTime
        state
        submitNum
      }
    }
  }
}

The JavaScript code does a .replace('WORKFLOW_ID', this.workflowId) (probably to be replaced by variables in #234).

So this query will always run for 1 workflow - for now.

The possible solution for now would be to get a flat list of familyProxies with their firstParent. So the new query will be:

{
  workflows(ids: ["WORKFLOW_ID"]) {
    id
    name
    status
    owner
    host
    port
    taskProxies(sort: { keys: ["cyclePoint"] }) {
      id
      state
      cyclePoint
      latestMessage
      task {
        meanElapsedTime
        name
      }
      jobs(sort: { keys: ["submit_num"], reverse:true }) {
        id
        batchSysName
        batchSysJobId
        host
        startedTime
        submittedTime
        finishedTime
        state
        submitNum
      }
    }
    // added family proxies here <------
    familyProxies (sort: { keys: ["firstParent"]}) {
      id
      firstParent {
        id
      }
    }
  }
}

For testing, I am using a suite based on the example provided by @dwsutherland above:

# file: ~/cylc-suites/families2/suite.rc

[scheduling]
initial cycle point = now

[[dependencies]]
[[[P1Y]]]
graph = "FAM3:succeed-all => FAM6"

[runtime]
    [[root]]
        init-script = echo "I'm first"
        env-script = echo "Hi first, I'm second"
        script = sleep 5; echo "RubyRubyRubyRuby"
        exit-script = echo 'Yay!'
        err-script = echo 'Boo!'
    [[FAM]]
    [[FAM2]]
        inherit = FAM
    [[FAM3]]
        inherit = FAM2
    [[foo]]
        inherit = FAM3
    [[bar]]
        inherit = FAM3
    [[FAM4]]
    [[FAM5]]
    [[FAM6]]
        inherit = FAM5
    [[qux]]
        inherit = FAM6
    [[qaz]]
        inherit = FAM4, FAM
    [[qar]]
        inherit = FAM, FAM4, FAM5

Running with Cylc 7, I get the following:

GIFrecord_2019-10-17_140730

A complete GraphQL response for the query can be seen here in this GitHub gist: https://gist.github.com/kinow/89fc7b126e38b19a2cf3de7d3287f695

The family part is returned as follows:

        "familyProxies": [
          {
            "id": "kinow|families2|20251017T1409+13|root",
            "firstParent": null
          },
          {
            "id": "kinow|families2|20261017T1409+13|root",
            "firstParent": null
          },
          {
            "id": "kinow|families2|20271017T1409+13|root",
            "firstParent": null
          },
          {
            "id": "kinow|families2|20251017T1409+13|FAM2",
            "firstParent": {
              "id": "kinow|families2|20251017T1409+13|FAM"
            }
          },
          {
            "id": "kinow|families2|20251017T1409+13|FAM3",
            "firstParent": {
              "id": "kinow|families2|20251017T1409+13|FAM2"
            }
          },
          {
            "id": "kinow|families2|20251017T1409+13|FAM6",
            "firstParent": {
              "id": "kinow|families2|20251017T1409+13|FAM5"
            }
          },
          {
            "id": "kinow|families2|20251017T1409+13|FAM5",
            "firstParent": {
              "id": "kinow|families2|20251017T1409+13|root"
            }
          },
          {
            "id": "kinow|families2|20251017T1409+13|FAM",
            "firstParent": {
              "id": "kinow|families2|20251017T1409+13|root"
            }
          },
          {
            "id": "kinow|families2|20261017T1409+13|FAM2",
            "firstParent": {
              "id": "kinow|families2|20261017T1409+13|FAM"
            }
          },
          {
            "id": "kinow|families2|20261017T1409+13|FAM3",
            "firstParent": {
              "id": "kinow|families2|20261017T1409+13|FAM2"
            }
          },
          {
            "id": "kinow|families2|20261017T1409+13|FAM6",
            "firstParent": {
              "id": "kinow|families2|20261017T1409+13|FAM5"
            }
          },
          {
            "id": "kinow|families2|20261017T1409+13|FAM",
            "firstParent": {
              "id": "kinow|families2|20261017T1409+13|root"
            }
          },
          {
            "id": "kinow|families2|20261017T1409+13|FAM5",
            "firstParent": {
              "id": "kinow|families2|20261017T1409+13|root"
            }
          },
          {
            "id": "kinow|families2|20271017T1409+13|FAM2",
            "firstParent": {
              "id": "kinow|families2|20271017T1409+13|FAM"
            }
          },
          {
            "id": "kinow|families2|20271017T1409+13|FAM3",
            "firstParent": {
              "id": "kinow|families2|20271017T1409+13|FAM2"
            }
          },
          {
            "id": "kinow|families2|20271017T1409+13|FAM",
            "firstParent": {
              "id": "kinow|families2|20271017T1409+13|root"
            }
          }
        ]

So the plan for this issue now is to use this new query first. And confirm nothing breaks. Then use the new familyProxies attribute of the data.workflow JS object (which goes into the Vuex store) to create the hierarchy for the families, and display in the Tree component as in Cylc 7 GUI (including the root family omitted).

Will start working on this next Monday, so feel free to comment if I missed anything :+1:

Cheers Bruno

kinow commented 5 years ago

(or if you have a better suite to test the families, that'd be helpful too :grimacing: )

dwsutherland commented 5 years ago

So the plan for this issue now is to use this new query first. And confirm nothing breaks. Then use the new familyProxies attribute of the data.workflow JS object (which goes into the Vuex store) to create the hierarchy for the families, and display in the Tree component as in Cylc 7 GUI (including the root family omitted).

Probably want firstParent field in taskProxies also... Or childTasks under familyProxies (in which case root can have childTasks)...

kinow commented 5 years ago

Good point @dwsutherland ! Otherwise how would I know how to add the tasks to the families 😆

hjoliver commented 5 years ago

Family tree graph by Cylc 7 for your example above.
fam

Note multiple inheritance, but only first-parents are used for the linear tree in the UI:

$ cylc list -t -b fam
root 
 ├─FAM 
 │ └─FAM2 
 │   └─FAM3 
 │     ├─bar 
 │     └─foo 
 └─FAM5 
   └─FAM6 
     └─qux

hjoliver commented 5 years ago

(or if you have a better suite to test the families, that'd be helpful too )

Any suite will do, so long as it has several levels of family, and some tasks at the bottom (which yours does).

kinow commented 5 years ago

Created WIP PR, but before changing more the code, first I will spend some time trying to understand where/how it was done in Cylc 7 :+1: and perhaps adapt (i.e. shamelessly copy) some code :grimacing:

hjoliver commented 5 years ago

@kinow - can you comment on which approach you've taken in the PR. As I recall we've discussed (but not yet decided for sure):

dense, concrete tree query (i.e server returns the nested structure with full node data for specific cycle points, in the leaves; entirely specific to the tree view)
flat list of nodes plus sparse concrete tree structure (i.e. tree with specific cycles points but no node data in the leaves; UI to populate that from the shared flat list)
flat list of nodes plus abstract family tree (i.e. generic family inheritance info, without specific cycle points - UI constructs the nested tree for specific cycle points).

(By "concrete" I mean the abstract/generic structure has been "instantiated" for the specific cycle points currently in the workflow task pool).

kinow commented 5 years ago

Sure @hjoliver

dense, concrete tree query (i.e server returns the nested structure with full node data for specific cycle points, in the leaves; entirely specific to the tree view)

From what I understood, that's not possible now. The cycle points information, for example, is bundled withing a TaskProxy. Our current approach uses a simple (almost flat) structure with 1 workflow ---> N taskProxies.

Then within a TaskProxy, we have the cyclepoint (which we are extracting out to create the hierarchy for the tree view at the moment), the Task, and the Job.

flat list of nodes plus sparse concrete tree structure (i.e. tree with specific cycles points but no node data in the leaves; UI to populate that from the shared flat list)

That's closer to what we have at the moment I think, but we have the node data in the leaves. This is/was supposed to be our shared flat list. Each view, when accessed the first time subscribes to a GraphQL query called root. If there is already a root query registered, then the query is merged/augmented. In theory it was supposed to have the data for a workflow that could be used for graph and tree and other views - from what I understood.

flat list of nodes plus abstract family tree (i.e. generic family inheritance info, without specific cycle points - UI constructs the nested tree for specific cycle points).

This is what I have in mind for this issue. We have a flat list of nodes already, with data. i.e. a workflow, with several task proxies without any hierarchy.

I want to check first if that's the same in Cylc 7 GUI (and try to learn more how we keep this data in Cylc 8). Then possibly keep the current code that constructs the hierarchy 1 workflow --> N cycle points --> N task proxies.

In the WIP PR I've added to the GraphQL workflows item a familyProxies { id, firstParent}, and also added firstParent to the Task Proxy.

With this, I believe we should be able to create the families hierarchy. The first parent of a family proxy contains, in its ID, the cycle point too. So the first approach I thought yesterday was to simply:

for a given workflow extract the cycle point hierarchy
for each cycle point then insert the family hierarchy
go through the task proxies and attach them to the right family
in the UI, hide the root family as in Cylc 7

The Vuex store data structure will be ~flat (i.e. a workflow with many task proxies, and many family proxies). The Graph View, from what I remember will augment the query with nodesEdges. So the final structure, shared by components, would be something like:

{
  workflow {
    id
    ...
    taskProxies { ... }
    familyProxies { ... }
    nodeEdges { ... }
  }
}

kinow commented 5 years ago

(Note: so far, we only have 1 workflow in the app... there were discussions around multiple workflows in both GraphQL and UI issues/PR's, but I assume we will need to discuss if it's doable, and how to implement it later... the only reason why the workflow is there, is because the JS code is using that as a root node to extract information... but in theory we could even remove it, update the JS code, and have a structure with { taskProxies: {}, familyProxies: {}, nodesEdges: {} })

kinow commented 5 years ago

(feel free to ignore this comment, it's a brain dump for myself, that I wrote while debugging cylc gui $workflow)

In Cylc 7 the GUI renders the tree in lib/cylc/gui/updater_tree, with a TreeUpdater thread object. Its update method is called similarly to Vue components' render method.

This thread shares data with another thread, the updated thread, from lib/cylc/gui/updater.py. This Updater object instance is passed in the constructor of TreeUpdater, and is responsible for fetching data about suites. This Updater object is similar to our WorkflowService in Cylc 8 Web GUI.

The updater uses SuiteRuntimeServiceClient to fetch data, in the same manner WorkflowService uses WebSocket/polling to talk to the GraphQL backend, which uses the ZMQ client to fetch data. The data returned contains information about ancestors, descendants, task summaries, etc.

In the UpdaterThread thread object, there is a update_gui method. This one will use the data structures mentioned before, to render the components. It uses a GtkTreeStore for the hierarchical data, which is the equivalent to the current workflowsTree in our Vuex Store, i.e. this is the data structure with hierarchy used to actually render the GUI.

When updating the GUI, it also parses the Task ID (which we will have to do with family IDs actually), calculates the task progress (which we are doing already, in the exact same way). It also ignores the family root, inserting instead the cycle point (we will probably have to do something similar).

Then part of the hierarchy for the tree is created on these few lines at the end of the method (which I will look with more calm later, but looks useful for the Web GUI): https://github.com/cylc/cylc-flow/blob/de7d938496e82dbdfb165938145670dd8e801efd/lib/cylc/gui/updater_tree.py#L359-L377

But where the data is actually synced between what came from the task pool and what is displayed in the GTK UI, is done here: https://github.com/cylc/cylc-flow/blob/de7d938496e82dbdfb165938145670dd8e801efd/lib/cylc/gui/updater_tree.py#L397-L501

From what I understood from the code, for the workflow

    [[root]]
        init-script = echo "I'm first"
        env-script = echo "Hi first, I'm second"
        script = sleep 5; echo "RubyRubyRubyRuby"
        exit-script = echo 'Yay!'
        err-script = echo 'Boo!'
    [[FAM]]
    [[FAM2]]
        inherit = FAM
    [[FAM3]]
        inherit = FAM2
    [[foo]]
        inherit = FAM3
    [[bar]]
        inherit = FAM3
    [[FAM4]]
    [[FAM5]]
    [[FAM6]]
        inherit = FAM5
    [[qux]]
        inherit = FAM6
    [[qaz]]
        inherit = FAM4, FAM
    [[qar]]
        inherit = FAM, FAM4, FAM5

It will iterate cycle points, and at one cycle point qux's hierarchy would be created as FAM5 / FAM6 / qux. First the code iterates through the families FAM5, then FAM6 keeping a reference to this last one (root was discarded earlier on). Then it will get qux, already knowing its parent. But this qux won't have the node data.

Before inserting the node into the GTK tree, the code retrieves the node data from another data structure, and then finally inserts it.

The old code looks a bit more complicated because it needs to keep track of adding rows/columns. Whereas the Vue.js code simply updates the data structure, but we don't have access to rows/columns. Instead we update an object that is reactive, triggering the update of an object in the virtual DOM (which later reflects in an updated DOM element).

So I think we will have a very similar approach. With the Vuex store data being quasi-flat. And a the existing workflowTree playing the same role as TreeUpdater.ttreestore GTK tree store, used to render the data with hierarchy.

hjoliver commented 5 years ago

Nice description :+1: (And good code-reading).

The old code looks a bit more complicated ...

It was pretty horrible, as I recall.

... Whereas the Vue.js code simply updates the data structure, but we don't have access to rows/columns. Instead we update an object that is reactive, triggering the update of an object in the virtual DOM (which later reflects in an updated DOM element).

Yeah, this is frickin' brilliant. Go the virtual DOM!!! :boom:

kinow commented 5 years ago

Note to self:

Need to add group state, in other words, display the task state, then compute the family state based on its children tasks, repeat that for any parent family of this family and, finally, compute the cycle point state too.

Screen shot of what it looks like in Cylc 7 GUI

kinow commented 5 years ago

The icons for the family state, and also the group-state of the cycle point (based on its direct children states) has been implemented.

Rebase/squashed commits in the PR branch. Found some issues while testing with the complex workflow. Modified the code in my laptop to print error messages, as debugging while running the complex workflow works, but everything gets super slow.

bla

One of the errors was due to firstParent of the TaskProxy object having the value null. It was for tasks that were demoting the first parent, using inherit = None, LOCAL, for example.

kinow commented 5 years ago

The issue described above is being addressed in this PR: https://github.com/cylc/cylc-flow/pull/3418 (thanks @dwsutherland for the super fast work)

cylc / cylc-ui

Tree component must correctly render families #229