Closed kinow closed 5 years ago
Quickly hacked five
to have a BAR
family with two children bar1
and bar2
:
[meta]
title = "Inter-cycle dependence + a cold-start task"
[cylc]
UTC mode = True
[scheduling]
#runahead limit = 120
initial cycle point = 20130808T00
final cycle point = 20150810T00
[[dependencies]]
[[[R1]]]
graph = "prep => foo"
[[[PT12H]]]
graph = "foo[-PT12H] => foo => BAR"
[runtime]
[[root]]
script = "sleep 2; echo Hi!"
[[BAR]]
script = "sleep 2; echo Family!"
[[bar1]]
inherit = "BAR"
[[bar2]]
inherit = "BAR"
Here's the result:
The children are displayed. The family is never displayed in the tree.
Here's the GraphQL query executed:
{
workflows(ids: ["five"]) {
id
name
status
owner
host
port
taskProxies {
id
state
cyclePoint
task {
meanElapsedTime
name
}
jobs {
id
host
startedTime
state
submitNum
}
}
}
}
And the response:
{
"data": {
"workflows": [
{
"id": "kinow|five",
"name": "five",
"status": "running",
"owner": "kinow",
"host": "ranma",
"port": 43060,
"taskProxies": [
{
"id": "kinow|five|20130829T0000Z|bar2",
"state": "succeeded",
"cyclePoint": "20130829T0000Z",
"task": {
"meanElapsedTime": 1.899999976158142,
"name": "bar2"
},
"jobs": [
{
"id": "kinow|five|20130829T0000Z|bar2|1",
"host": "localhost",
"startedTime": "2019-09-16T05:51:23Z",
"state": "succeeded",
"submitNum": 1
}
]
},
{
"id": "kinow|five|20130829T1200Z|bar2",
"state": "submitted",
"cyclePoint": "20130829T1200Z",
"task": {
"meanElapsedTime": 1.899999976158142,
"name": "bar2"
},
"jobs": [
{
"id": "kinow|five|20130829T1200Z|bar2|1",
"host": "localhost",
"startedTime": "",
"state": "submitted",
"submitNum": 1
}
]
},
{
"id": "kinow|five|20130829T0000Z|bar1",
"state": "succeeded",
"cyclePoint": "20130829T0000Z",
"task": {
"meanElapsedTime": 1.899999976158142,
"name": "bar1"
},
"jobs": [
{
"id": "kinow|five|20130829T0000Z|bar1|1",
"host": "localhost",
"startedTime": "2019-09-16T05:51:23Z",
"state": "succeeded",
"submitNum": 1
}
]
},
{
"id": "kinow|five|20130829T1200Z|foo",
"state": "succeeded",
"cyclePoint": "20130829T1200Z",
"task": {
"meanElapsedTime": 1.899999976158142,
"name": "foo"
},
"jobs": [
{
"id": "kinow|five|20130829T1200Z|foo|1",
"host": "localhost",
"startedTime": "2019-09-16T05:51:23Z",
"state": "succeeded",
"submitNum": 1
}
]
},
{
"id": "kinow|five|20130829T0000Z|foo",
"state": "succeeded",
"cyclePoint": "20130829T0000Z",
"task": {
"meanElapsedTime": 1.899999976158142,
"name": "foo"
},
"jobs": [
{
"id": "kinow|five|20130829T0000Z|foo|1",
"host": "localhost",
"startedTime": "2019-09-16T05:51:18Z",
"state": "succeeded",
"submitNum": 1
}
]
},
{
"id": "kinow|five|20130830T0000Z|foo",
"state": "submitted",
"cyclePoint": "20130830T0000Z",
"task": {
"meanElapsedTime": 1.899999976158142,
"name": "foo"
},
"jobs": [
{
"id": "kinow|five|20130830T0000Z|foo|1",
"host": "localhost",
"startedTime": "",
"state": "submitted",
"submitNum": 1
}
]
},
{
"id": "kinow|five|20130829T1200Z|bar1",
"state": "submitted",
"cyclePoint": "20130829T1200Z",
"task": {
"meanElapsedTime": 1.899999976158142,
"name": "bar1"
},
"jobs": [
{
"id": "kinow|five|20130829T1200Z|bar1|1",
"host": "localhost",
"startedTime": "",
"state": "submitted",
"submitNum": 1
}
]
}
]
}
]
}
}
So we do not have families in the response yet. My guess is that we will have to change the query, to something like:
{
workflows(ids: ["five"]) {
id
name
status
owner
host
port
familyProxies {
id
name
state
childTasks {
id
state
cyclePoint
task {
meanElapsedTime
name
}
jobs {
id
host
startedTime
state
submitNum
}
}
}
}
}
But probably need to confirm with @dwsutherland if this query makes sense (e.g. I am excluding familyProxies / family
because I only need the name for the tree, and it seems like the proxy/family share the same name?). Or if there is a better alternative to retrieve workflow + family + task + job information.
@kinow - I'm pretty sure @dwsutherland said we have the flexibility to get a nested tree back via GraphQL by just structuring the query the right way, although I couldn't pull that query out of the hat myself yet.
However, that conflicts with our shared data store concept: flat list of nodes and edges, with nodes used by all views and nodes+edges used by the graph view. If we do that we'll have to compute the family tree structure in the UI, for the tree view, by using the abstract tree info.
I guess the good news is GraphQL gives us the power to go either way, but the bad news (for you at the moment :-) is we decided to go with common data store for optimized network transfer (and to avoid cross-view duplication) at least as a first cut.
Showing the tree structure isn't a blocker for alpha-1 relase IMO, unless it's easy (after getting the leaf data in first).
@oliver-sanders - does this conform with your understanding?
Showing the tree structure isn't a blocker for alpha-1 relase IMO, unless it's easy (after getting the leaf data in first).
Much better that way. And work on the leaf data starting tomorrow morning! :steam_locomotive:
@kinow - I'm pretty sure @dwsutherland said we have the flexibility to get a nested tree back via GraphQL by just structuring the query the right way, although I couldn't pull that query out of the hat myself yet.
Was something like this:
fragment treeNest on FamilyProxy {
name
cyclePoint
state
depth
childTasks(ids: $nIds, states: $nStates, mindepth: $minDepth, maxdepth: $maxDepth) {
id
state
latestMessage
depth
jobs {
id
host
batchSysName
batchSysJobId
submittedTime
startedTime
finishedTime
submitNum
}
}
}
query tree($wIds: [ID], $nIds: [ID], $nStates: [String], $minDepth: Int, $maxDepth: Int) {
workflows(ids: $wIds) {
id
name
status
stateTotals
treeDepth
}
familyProxies(workflows: $wIds, ids: ["root"]) {
...treeNest
childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
...treeNest
childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
...treeNest
childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
...treeNest
childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
...treeNest
}
}
}
}
}
}
{
"wIds": ["baz"],
"nIds": ["20*|*"],
"nStates": ["succeeded", "waiting", "held"],
"minDepth": 0,
"maxDepth": 4
}
However, that conflicts with our shared data store concept: flat list of nodes and edges, with nodes used by all views and nodes+edges used by the graph view. If we do that we'll have to compute the family tree structure in the UI, for the tree view, by using the abstract tree info.
I think we may need to revisit this one... Because we have to consider:
Having said that.. you could change the above inheritance structure to just return the IDs, and then fill it in from the flat list of family and task nodes:
fragment treeNest on FamilyProxy {
id
childTasks(ids: $nIds, states: $nStates, mindepth: $minDepth, maxdepth: $maxDepth) {
id
jobs {
id
}
}
}
query tree($wIds: [ID], $nIds: [ID], $nStates: [String], $minDepth: Int, $maxDepth: Int) {
workflows(ids: $wIds) {
id
}
familyProxies(workflows: $wIds, ids: ["root"]) {
...treeNest
childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
...treeNest
childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
...treeNest
childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
...treeNest
childFamilies(mindepth: $minDepth, maxdepth: $maxDepth) {
...treeNest
}
}
}
}
}
}
{
"wIds": ["baz"],
"nIds": ["20*|*"],
"nStates": ["succeeded", "waiting", "held"],
"minDepth": 0,
"maxDepth": 4
}
Thanks @dwsutherland ! Looks like it may be doable, but can't guarantee it would be doable for the alpha release (cc @hjoliver, really really unlikely to include this one I think).
Moved to 0.2
@dwsutherland
However, that conflicts with our shared data store concept: flat list of nodes and edges, with nodes used by all views and nodes+edges used by the graph view. If we do that we'll have to compute the family tree structure in the UI, for the tree view, by using the abstract tree info.
I think we may need to revisit this one... Because we have to consider: [pagination and different view requirements...]
Agreed, we need to put some thought into this to figure out which way to go.
My task for the next week. Will try first to fix the issue to support variables in queries. Then work on this one :+1:
@oliver-sanders - can you comment on the UI data-store implications of task family tree structure, as raised above? Do we request nested data and duplicate the data (if non-nested views are also active) or - as planned - share flat data and compute the tree in the UI?
can you comment on the UI data-store implications of task family tree structure, as raised above
I think this undermines the central data store pretty badly. There would still be a large amount of overlap between subscriptions (as each view wants pretty much the same data) but it would be very hard to compute the overlap and merge the subscriptions accordingly. Consequences:
There's a tradeoff here, I originally envisaged a flat data store as this makes memory and network nice and efficient, keeping the load away from the workflow and UI Server. The tradeoff is that the UI has to do a bit of cross-referencing to get things to work. My logic was that the memory and network arguments were much more important and that whilst cross-referencing between different flat structures might be sub-optimal, it only has to be done on a case by case basis when data actually changes (the joy of a responsive, data-driven system). So maybe the first load takes a little bit longer, but from then on in, the performance cost aught to be minimal.
So personally I would much rather have a central data store with everything in a simple standard data structure with all of the views providing hooks to respond to data structure changes. This is more of a "watch" pattern which doesn't seem to fit perfectly with Vue (maybe we need to look into observables or whatever they're called).
tldr;
If looking up the family structure for each task is proving too computationally expensive, could we not just request the list of linearised ancestors for each task? This information is known by the workflow and should be cheep to pass through.
@kinow - Oliver has explained the situation nicely there.
We don't have any evidence yet that the extra UI computation (computing the family tree client-side) will be a problem, and we can always change tack later if it turns out that it is a problem (but I'm skeptical that it will be bad compared to other client-side computation that we can't avoid anyway).
So on that basis we will stick to the original plan: the UI should request flat data for easy sharing between views (and minimal memory footprint) and compute the family tree client-side.
@MartinRyan - this will affect the graph view too in some sense. Even though you don't want the dependency data in tree form, we need family collapse in the graph. (And that has to be computed client-side, unlike for Cylc-7).
The old UI computes the family tree from:
In the new UI we can either:
The new way (OR) might be easier...
We don't have any evidence yet that the extra UI computation (computing the family tree client-side) will be a problem
https://github.com/cylc/cylc-ui/issues/227#issuecomment-534300509
The scripting part is a bit worrying, but at this moment I am chasing memory leaks.
But my impression is that for the complex suite, with polling (! might get better when websockets/data updates is used), the UI performance downgrades significantly because of the re-computing of data, and consequent update re-render of DOM elements. This, combined with several SVG elements being rendered too.
The memory leaks are hindering the performance too. But once they are fixed, until we have websockets working with data-driven updates, users might struggle to run complex workflow with the tree component.
we can always change tack later if it turns out that it is a problem (but I'm skeptical that it will be bad compared to other client-side computation that we can't avoid anyway).
π
So on that basis we will stick to the original plan: the UI should request flat data for easy sharing between views (and minimal memory footprint) and compute the family tree client-side.
Alright. The query that I was going to use was based on the example with fragments and nested levels provided by @dwsutherland a few comments above βοΈ . Does that mean that actually we will have another query that returns a single level with all information about tasks / families / jobs / etc ? With no hierarchy?
(obviously, the performance problems above could all be due to bugs in the tree component too, or some bad design in the tree component π¬ )
@kinow - I assumed your "complex suite" + treeview problem is a memory leak (as I think you suggested?) as it grew over time.
But once they are fixed, until we have websockets working with data-driven updates
I'm not worried about that - we will definitely have websockets and data-driven updates before anyone needs to run real suites.
Not only memory leaks. Even before we had the tree component, when we were using the vue-ads-table-tree component, it was already impossible to run the complex suite.
The five suite worked fine for me, but during the development of the tree component I wanted to make sure I didn't miss any possible bug, and decided to look at performance with the complex suite.
That showed that we had memory leaks, and that the complex suite was still not usable with the cylc web UI.
the UI performance downgrades significantly because of the re-computing of data, and consequent update re-render of DOM elements.
I don't have a good feel for what gets recomputed and what gets rerendered in the new UI, and what we can expect peformance-wise compared to native Python + PyGTK.
Once we have all the tree view basics in place, we can start looking at performance.
Note that the proposed "n-distance" changes will help by greatly reducing the number of waiting and succeeded tasks that the UI has to display (relative the the complete task pool as shown now).
Once we have all the tree view basics in place, we can start looking at performance.
π
Note that the proposed "n-distance" changes will help by greatly reducing the number of waiting and succeeded tasks that the UI has to display (relative the the complete task pool as shown now).
π
Updating the tree view and component shouldn't be too hard if necessary. I just need to understand what will change and where, and see where I can help (happy to dive into the GraphQL backend again if necessary)
Not only memory leaks ...
The way I see it, we've only just reached the point we have all components of the new architecture working together, and I'm not expecting great performance yet. Having got to this point though, particularly once the tree view has all basic functionality, we do need to start thinking about performance.
I don't see why we can't do both...
You can't dribble in updates very easily with only a nested structure (because you'd need to send the structure with it each time, since the UI would need to know where in the graph/tree structure to update).. But a flat list makes it easy to update just the nodes..
The data-driven (nested only) approach would be efficient on the UI if your view is limited, you only need a fraction of the full data set, which means updates and the amount of data held by the UI is small..
And yes, the task and family proxies have parent and children fields..
I don't see why we can't do both...
The nested one giving the structure (only ids returned) with query/subscription (which only updates when a cycle of ids is added or removed)
That's a good point. Somewhat analogous to the graph edges (structural information, without all the data associated with each node). For a huge suite it might still be a signficant memory hit though, in which case it may still be better to avoid it if the tree computation can be done client-side very easily and quickly. Worth thinking about though - ping @oliver-sanders
And yes, the task and family proxies have parent and children fields..
Is that the full list (back to root), not just immediate parents and children?
And yes, the task and family proxies have parent and children fields..
Is that the full list (back to root), not just immediate parents and children?
I'll just check... other question is whether it's in order (if a flat list)
Even before we had the tree component, when we were using the vue-ads-table-tree component, it was already impossible to run the complex suite.
(for the record, via Riot chat, the current treeview handles the complex suite pretty well so long as you don't allow too many jobs to run at once on the workflow host)
Given an inheritance:
[runtime]
[[root]]
init-script = echo "I'm first"
env-script = echo "Hi first, I'm second"
script = sleep 5; echo "RubyRubyRubyRuby"
exit-script = echo 'Yay!'
err-script = echo 'Boo!'
[[FAM]]
[[FAM2]]
inherit = FAM
[[FAM3]]
inherit = FAM2
[[foo]]
inherit = FAM3
[[bar]]
inherit = FAM3
[[FAM4]]
[[FAM5]]
[[FAM6]]
inherit = FAM5
[[qux]]
inherit = FAM6
[[qaz]]
inherit = FAM4, FAM
[[qar]]
inherit = FAM, FAM4, FAM5
You can gather the inheritance information via the fields:
query {
tasks(workflows: ["baz"], ids: ["foo"]) {
id
namespace
depth
}
taskProxies(workflows: ["baz"], ids: ["foo", "qar"]) {
id
parents (sort: {keys: ["depth", "id"]}) {
id
}
firstParent {
id
}
namespace
depth
}
families(workflows: ["baz"], ids: ["FAM2", "FAM3", "FAM5"], sort: {keys: ["id"]}) {
id
childTasks {
id
}
childFamilies {
id
}
parents (sort: {keys: ["depth", "id"]}){
id
}
depth
}
familyProxies(workflows: ["baz"], ids: ["FAM2", "FAM3", "FAM5"], sort: {keys: ["id"]}) {
id
parents (sort: {keys: ["depth", "id"]}) {
id
}
firstParent {
id
}
childTasks {
id
}
childFamilies {
id
}
depth
}
}
Although, first parent might be enough:
query {
taskProxies(workflows: ["baz"], ids: ["foo", "qar"]) {
id
firstParent {
id
}
}
familyProxies(workflows: ["baz"], ids: ["FAM2", "FAM3", "FAM5"], sort: {keys: ["id"]}) {
id
firstParent {
id
}
}
}
{
"data": {
"taskProxies": [
{
"id": "sutherlander|baz|20170101T0000+13|foo",
"firstParent": {
"id": "sutherlander|baz|20170101T0000+13|FAM3"
}
},
{
"id": "sutherlander|baz|20170101T0000+13|qar",
"firstParent": {
"id": "sutherlander|baz|20170101T0000+13|FAM"
}
}
],
"familyProxies": [
{
"id": "sutherlander|baz|20170101T0000+13|FAM2",
"firstParent": {
"id": "sutherlander|baz|20170101T0000+13|FAM"
}
},
{
"id": "sutherlander|baz|20170101T0000+13|FAM3",
"firstParent": {
"id": "sutherlander|baz|20170101T0000+13|FAM2"
}
},
{
"id": "sutherlander|baz|20170101T0000+13|FAM5",
"firstParent": {
"id": "sutherlander|baz|20170101T0000+13|root"
}
}
]
}
}
I will move the discussion of the tree family elsewhere for now, probably Riot, and reply here later with what was decided there.
Thanks @dwsutherland & @hjoliver for the patience on Riot.
The current query used by the Tree.vue
view, given to the LiveWorkflowService
to query/poll data is:
{
workflows(ids: ["WORKFLOW_ID"]) {
id
name
status
owner
host
port
taskProxies(sort: { keys: ["cyclePoint"] }) {
id
state
cyclePoint
latestMessage
task {
meanElapsedTime
name
}
jobs(sort: { keys: ["submit_num"], reverse:true }) {
id
batchSysName
batchSysJobId
host
startedTime
submittedTime
finishedTime
state
submitNum
}
}
}
}
The JavaScript code does a .replace('WORKFLOW_ID', this.workflowId)
(probably to be replaced by variables in #234).
So this query will always run for 1 workflow - for now.
The possible solution for now would be to get a flat list of familyProxies
with their firstParent
. So the new query will be:
{
workflows(ids: ["WORKFLOW_ID"]) {
id
name
status
owner
host
port
taskProxies(sort: { keys: ["cyclePoint"] }) {
id
state
cyclePoint
latestMessage
task {
meanElapsedTime
name
}
jobs(sort: { keys: ["submit_num"], reverse:true }) {
id
batchSysName
batchSysJobId
host
startedTime
submittedTime
finishedTime
state
submitNum
}
}
// added family proxies here <------
familyProxies (sort: { keys: ["firstParent"]}) {
id
firstParent {
id
}
}
}
}
For testing, I am using a suite based on the example provided by @dwsutherland above:
# file: ~/cylc-suites/families2/suite.rc
[scheduling]
initial cycle point = now
[[dependencies]]
[[[P1Y]]]
graph = "FAM3:succeed-all => FAM6"
[runtime]
[[root]]
init-script = echo "I'm first"
env-script = echo "Hi first, I'm second"
script = sleep 5; echo "RubyRubyRubyRuby"
exit-script = echo 'Yay!'
err-script = echo 'Boo!'
[[FAM]]
[[FAM2]]
inherit = FAM
[[FAM3]]
inherit = FAM2
[[foo]]
inherit = FAM3
[[bar]]
inherit = FAM3
[[FAM4]]
[[FAM5]]
[[FAM6]]
inherit = FAM5
[[qux]]
inherit = FAM6
[[qaz]]
inherit = FAM4, FAM
[[qar]]
inherit = FAM, FAM4, FAM5
Running with Cylc 7, I get the following:
A complete GraphQL response for the query can be seen here in this GitHub gist: https://gist.github.com/kinow/89fc7b126e38b19a2cf3de7d3287f695
The family part is returned as follows:
"familyProxies": [
{
"id": "kinow|families2|20251017T1409+13|root",
"firstParent": null
},
{
"id": "kinow|families2|20261017T1409+13|root",
"firstParent": null
},
{
"id": "kinow|families2|20271017T1409+13|root",
"firstParent": null
},
{
"id": "kinow|families2|20251017T1409+13|FAM2",
"firstParent": {
"id": "kinow|families2|20251017T1409+13|FAM"
}
},
{
"id": "kinow|families2|20251017T1409+13|FAM3",
"firstParent": {
"id": "kinow|families2|20251017T1409+13|FAM2"
}
},
{
"id": "kinow|families2|20251017T1409+13|FAM6",
"firstParent": {
"id": "kinow|families2|20251017T1409+13|FAM5"
}
},
{
"id": "kinow|families2|20251017T1409+13|FAM5",
"firstParent": {
"id": "kinow|families2|20251017T1409+13|root"
}
},
{
"id": "kinow|families2|20251017T1409+13|FAM",
"firstParent": {
"id": "kinow|families2|20251017T1409+13|root"
}
},
{
"id": "kinow|families2|20261017T1409+13|FAM2",
"firstParent": {
"id": "kinow|families2|20261017T1409+13|FAM"
}
},
{
"id": "kinow|families2|20261017T1409+13|FAM3",
"firstParent": {
"id": "kinow|families2|20261017T1409+13|FAM2"
}
},
{
"id": "kinow|families2|20261017T1409+13|FAM6",
"firstParent": {
"id": "kinow|families2|20261017T1409+13|FAM5"
}
},
{
"id": "kinow|families2|20261017T1409+13|FAM",
"firstParent": {
"id": "kinow|families2|20261017T1409+13|root"
}
},
{
"id": "kinow|families2|20261017T1409+13|FAM5",
"firstParent": {
"id": "kinow|families2|20261017T1409+13|root"
}
},
{
"id": "kinow|families2|20271017T1409+13|FAM2",
"firstParent": {
"id": "kinow|families2|20271017T1409+13|FAM"
}
},
{
"id": "kinow|families2|20271017T1409+13|FAM3",
"firstParent": {
"id": "kinow|families2|20271017T1409+13|FAM2"
}
},
{
"id": "kinow|families2|20271017T1409+13|FAM",
"firstParent": {
"id": "kinow|families2|20271017T1409+13|root"
}
}
]
So the plan for this issue now is to use this new query first. And confirm nothing breaks. Then use the new familyProxies
attribute of the data.workflow
JS object (which goes into the Vuex store) to create the hierarchy for the families, and display in the Tree component as in Cylc 7 GUI (including the root
family omitted).
Will start working on this next Monday, so feel free to comment if I missed anything :+1:
Cheers Bruno
(or if you have a better suite to test the families, that'd be helpful too :grimacing: )
So the plan for this issue now is to use this new query first. And confirm nothing breaks. Then use the new familyProxies attribute of the data.workflow JS object (which goes into the Vuex store) to create the hierarchy for the families, and display in the Tree component as in Cylc 7 GUI (including the root family omitted).
Probably want firstParent
field in taskProxies
also... Or childTasks
under familyProxies
(in which case root
can have childTasks
)...
Good point @dwsutherland ! Otherwise how would I know how to add the tasks to the families π
Family tree graph by Cylc 7 for your example above.
Note multiple inheritance, but only first-parents are used for the linear tree in the UI:
$ cylc list -t -b fam
root
ββFAM
β ββFAM2
β ββFAM3
β ββbar
β ββfoo
ββFAM5
ββFAM6
ββqux
(or if you have a better suite to test the families, that'd be helpful too )
Any suite will do, so long as it has several levels of family, and some tasks at the bottom (which yours does).
Created WIP PR, but before changing more the code, first I will spend some time trying to understand where/how it was done in Cylc 7 :+1: and perhaps adapt (i.e. shamelessly copy) some code :grimacing:
@kinow - can you comment on which approach you've taken in the PR. As I recall we've discussed (but not yet decided for sure):
(By "concrete" I mean the abstract/generic structure has been "instantiated" for the specific cycle points currently in the workflow task pool).
Sure @hjoliver
dense, concrete tree query (i.e server returns the nested structure with full node data for specific cycle points, in the leaves; entirely specific to the tree view)
From what I understood, that's not possible now. The cycle points information, for example, is bundled withing a TaskProxy
. Our current approach uses a simple (almost flat) structure with 1 workflow ---> N taskProxies
.
Then within a TaskProxy
, we have the cyclepoint (which we are extracting out to create the hierarchy for the tree view at the moment), the Task
, and the Job
.
flat list of nodes plus sparse concrete tree structure (i.e. tree with specific cycles points but no node data in the leaves; UI to populate that from the shared flat list)
That's closer to what we have at the moment I think, but we have the node data in the leaves. This is/was supposed to be our shared flat list. Each view, when accessed the first time subscribes to a GraphQL query called root
. If there is already a root
query registered, then the query is merged/augmented. In theory it was supposed to have the data for a workflow that could be used for graph and tree and other views - from what I understood.
flat list of nodes plus abstract family tree (i.e. generic family inheritance info, without specific cycle points - UI constructs the nested tree for specific cycle points).
This is what I have in mind for this issue. We have a flat list of nodes already, with data. i.e. a workflow, with several task proxies without any hierarchy.
I want to check first if that's the same in Cylc 7 GUI (and try to learn more how we keep this data in Cylc 8). Then possibly keep the current code that constructs the hierarchy 1 workflow --> N cycle points --> N task proxies
.
In the WIP PR I've added to the GraphQL workflows
item a familyProxies { id, firstParent}
, and also added firstParent
to the Task Proxy.
With this, I believe we should be able to create the families hierarchy. The first parent of a family proxy contains, in its ID, the cycle point too. So the first approach I thought yesterday was to simply:
root
family as in Cylc 7The Vuex store data structure will be ~flat (i.e. a workflow with many task proxies, and many family proxies). The Graph View, from what I remember will augment the query with nodesEdges. So the final structure, shared by components, would be something like:
{
workflow {
id
...
taskProxies { ... }
familyProxies { ... }
nodeEdges { ... }
}
}
(Note: so far, we only have 1 workflow in the app... there were discussions around multiple workflows in both GraphQL and UI issues/PR's, but I assume we will need to discuss if it's doable, and how to implement it later... the only reason why the workflow
is there, is because the JS code is using that as a root node to extract information... but in theory we could even remove it, update the JS code, and have a structure with { taskProxies: {}, familyProxies: {}, nodesEdges: {} }
)
(feel free to ignore this comment, it's a brain dump for myself, that I wrote while debugging cylc gui $workflow
)
In Cylc 7 the GUI renders the tree in lib/cylc/gui/updater_tree
, with a TreeUpdater
thread object. Its update
method is called similarly to Vue components' render
method.
This thread shares data with another thread, the updated thread, from lib/cylc/gui/updater.py
. This Updater
object instance is passed in the constructor of TreeUpdater
, and is responsible for fetching data about suites. This Updater
object is similar to our WorkflowService
in Cylc 8 Web GUI.
The updater uses SuiteRuntimeServiceClient
to fetch data, in the same manner WorkflowService
uses WebSocket/polling to talk to the GraphQL backend, which uses the ZMQ client to fetch data. The data returned contains information about ancestors, descendants, task summaries, etc.
In the UpdaterThread
thread object, there is a update_gui
method. This one will use the data structures mentioned before, to render the components. It uses a GtkTreeStore
for the hierarchical data, which is the equivalent to the current workflowsTree
in our Vuex Store, i.e. this is the data structure with hierarchy used to actually render the GUI.
When updating the GUI, it also parses the Task ID (which we will have to do with family IDs actually), calculates the task progress (which we are doing already, in the exact same way). It also ignores the family root
, inserting instead the cycle point (we will probably have to do something similar).
Then part of the hierarchy for the tree is created on these few lines at the end of the method (which I will look with more calm later, but looks useful for the Web GUI): https://github.com/cylc/cylc-flow/blob/de7d938496e82dbdfb165938145670dd8e801efd/lib/cylc/gui/updater_tree.py#L359-L377
But where the data is actually synced between what came from the task pool and what is displayed in the GTK UI, is done here: https://github.com/cylc/cylc-flow/blob/de7d938496e82dbdfb165938145670dd8e801efd/lib/cylc/gui/updater_tree.py#L397-L501
From what I understood from the code, for the workflow
[[root]]
init-script = echo "I'm first"
env-script = echo "Hi first, I'm second"
script = sleep 5; echo "RubyRubyRubyRuby"
exit-script = echo 'Yay!'
err-script = echo 'Boo!'
[[FAM]]
[[FAM2]]
inherit = FAM
[[FAM3]]
inherit = FAM2
[[foo]]
inherit = FAM3
[[bar]]
inherit = FAM3
[[FAM4]]
[[FAM5]]
[[FAM6]]
inherit = FAM5
[[qux]]
inherit = FAM6
[[qaz]]
inherit = FAM4, FAM
[[qar]]
inherit = FAM, FAM4, FAM5
It will iterate cycle points, and at one cycle point qux
's hierarchy would be created as FAM5 / FAM6 / qux
. First the code iterates through the families FAM5
, then FAM6
keeping a reference to this last one (root
was discarded earlier on). Then it will get qux
, already knowing its parent. But this qux
won't have the node data.
Before inserting the node into the GTK tree, the code retrieves the node data from another data structure, and then finally inserts it.
The old code looks a bit more complicated because it needs to keep track of adding rows/columns. Whereas the Vue.js code simply updates the data structure, but we don't have access to rows/columns. Instead we update an object that is reactive, triggering the update of an object in the virtual DOM (which later reflects in an updated DOM element).
So I think we will have a very similar approach. With the Vuex store data being quasi-flat. And a the existing workflowTree
playing the same role as TreeUpdater.ttreestore
GTK tree store, used to render the data with hierarchy.
Nice description :+1: (And good code-reading).
The old code looks a bit more complicated ...
It was pretty horrible, as I recall.
... Whereas the Vue.js code simply updates the data structure, but we don't have access to rows/columns. Instead we update an object that is reactive, triggering the update of an object in the virtual DOM (which later reflects in an updated DOM element).
Yeah, this is frickin' brilliant. Go the virtual DOM!!! :boom:
Note to self:
Need to add group state, in other words, display the task state, then compute the family state based on its children tasks, repeat that for any parent family of this family and, finally, compute the cycle point state too.
Screen shot of what it looks like in Cylc 7 GUI
The icons for the family state, and also the group-state of the cycle point (based on its direct children states) has been implemented.
Rebase/squashed commits in the PR branch. Found some issues while testing with the complex
workflow. Modified the code in my laptop to print error messages, as debugging while running the complex workflow works, but everything gets super slow.
One of the errors was due to firstParent
of the TaskProxy
object having the value null
. It was for tasks that were demoting the first parent, using inherit = None, LOCAL
, for example.
The issue described above is being addressed in this PR: https://github.com/cylc/cylc-flow/pull/3418 (thanks @dwsutherland for the super fast work)
@hjoliver reported on riot families are not properly displayed with the current Tree Component. Adding as a tentative for the alpha release.