argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

3.5.6+ `items.status.nodes` disappeared from `/api/v1/workflows` endpoint for completed Workflows #13098

Open jacek-jablonski opened 5 months ago

jacek-jablonski commented 5 months ago

What happened/what you expected to happen?

While updating argo-workflows to v3.5.6, I noticed that the `status.nodes` field disappeared for completed workflows in the `/api/v1/workflows` endpoint. If the workflow is in the Running status, this field still appears. Is this expected behavior?

I cannot force this field to appear with:

https://argo-workflows.xxx/api/v1/workflows/?listOptions.limit=50&fields=items.status.nodes

Up to v3.5.5 it was working fine.
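For anyone reproducing this from a script, here is a minimal sketch of how the request above can be assembled. The `build_list_url` helper is hypothetical, and the host is the placeholder from the report, not a real server; only the path and the two query parameters come from the URL quoted above.

```python
from urllib.parse import urlencode


def build_list_url(base_url: str, limit: int, fields: str) -> str:
    """Build a workflow list URL with a page limit and a field selector."""
    query = urlencode({"listOptions.limit": limit, "fields": fields})
    return f"{base_url.rstrip('/')}/api/v1/workflows/?{query}"


# Placeholder host taken from the report; replace it with your own server.
url = build_list_url("https://argo-workflows.xxx", 50, "items.status.nodes")
print(url)
```

Running the same URL against v3.5.5 and v3.5.6 should make the difference in the returned items easy to compare.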

Version

v3.5.6

agilgur5 commented 5 months ago

Follow-up from this Slack thread. Per thread, nodeStatusOffload is not enabled and this only occurs on the list endpoint.

Given that it only occurs on list and completed Workflows, I'm guessing this is only impacting Archived Workflows, and so would be caused by https://github.com/argoproj/argo-workflows/pull/12912#discussion_r1572901283

agilgur5 commented 5 months ago

> I cannot force this field to appear with: https://argo-workflows.xxx/api/v1/workflows/?listOptions.limit=50&fields=items.status.nodes
> Up to v3.5.5 it was working fine.

This part would be the regression -- fields is supposed to be dynamic.

> If the workflow is in the Running status, this field still appears. Is this expected behavior?

So in 3.4, the Archived Workflows list wouldn't return nodes by default either per the diff I commented on in https://github.com/argoproj/argo-workflows/pull/11121#discussion_r1469101802.

The UI's default list request also doesn't include it.

So I think it being included in 3.5 was actually an unintended bug -- part of the whole performance regression that #12912 fixed that I discovered in https://github.com/argoproj/argo-workflows/pull/11121#discussion_r1469101802

agilgur5 commented 5 months ago

> So in 3.4, the Archived Workflows list wouldn't return nodes by default either per the diff I commented on in #11121 (comment).

Looking at it a bit more, it may not have been possible at all for Archived Workflows list in 3.4?

So this is perhaps intended behavior for Archived Workflows lists, just that in 3.5 they're also available in the regular list API as well.

jacek-jablonski commented 5 months ago

So, to summarize: the `status.nodes` field appeared in the 3.5 list endpoint by mistake and shouldn't be there. And by design it won't come back?

agilgur5 commented 5 months ago

Specifically for Archived Workflows, not necessarily live Workflows.

jacek-jablonski commented 1 month ago

@agilgur5 `items.metadata.ownerReferences` also seems to be missing for archived workflows. There seems to be quite a big discrepancy between what is returned for archived and live workflows on a single API endpoint.
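To make the discrepancy concrete, a small sketch like the following can list which field paths a live item has that an archived item lacks. The `field_paths` and `missing_in_archived` helpers and the two sample items are made up for illustration; they are not real API output.

```python
def field_paths(obj, depth=2, prefix=""):
    """Return dotted paths of a nested dict, descending at most `depth` levels."""
    paths = set()
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if depth > 1 and isinstance(value, dict) and value:
            paths |= field_paths(value, depth - 1, path)
        else:
            paths.add(path)
    return paths


def missing_in_archived(live_item, archived_item):
    """List the paths present in the live item but absent from the archived one."""
    return sorted(field_paths(live_item) - field_paths(archived_item))


# Hypothetical list items, trimmed to the fields discussed in this thread.
live = {
    "metadata": {"name": "wf-a", "ownerReferences": [{"kind": "CronWorkflow"}]},
    "status": {"phase": "Succeeded", "nodes": {"wf-a": {"phase": "Succeeded"}}},
}
archived = {
    "metadata": {"name": "wf-a"},
    "status": {"phase": "Succeeded"},
}

print(missing_in_archived(live, archived))
```

Feeding it one real item of each kind from the same list response would give the actual set of fields a custom GUI can no longer rely on.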

agilgur5 commented 1 month ago

Yea, anything that doesn't have its own column in the archived DB is not returned right now. And adding more columns or plucking more from the JSON blob causes performance issues per #13295 etc 😕

jacek-jablonski commented 1 month ago

Is it subject to change in 3.6? Or will it stay the same?

agilgur5 commented 1 month ago

They're technically supposed to be set dynamically per your fields parameters as I mentioned in https://github.com/argoproj/argo-workflows/pull/12912#discussion_r1581619939. If you can make that work, I'd probably consider that a patch fix in 3.5 even, but at a glance it looks non-trivial to do
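As a rough illustration of what "set dynamically per your fields parameters" means, here is a sketch of an include-only field mask applied to a decoded JSON response. This is not the Argo Server implementation; `select_fields` and the sample response are hypothetical, and the real `fields` handling may differ.

```python
def select_fields(value, paths):
    """Keep only the dotted `paths` of a JSON-like value; lists are filtered per element."""
    if isinstance(value, list):
        return [select_fields(element, paths) for element in value]
    if not isinstance(value, dict):
        return value
    # Group the remaining path segments by their first segment.
    grouped = {}
    for path in paths:
        head, _, rest = path.partition(".")
        grouped.setdefault(head, []).append(rest)
    result = {}
    for head, rests in grouped.items():
        if head not in value:
            continue
        if "" in rests:
            result[head] = value[head]  # the whole subtree was requested
        else:
            result[head] = select_fields(value[head], rests)
    return result


# Hypothetical list response with a single completed workflow.
response = {
    "metadata": {"resourceVersion": "1"},
    "items": [
        {
            "metadata": {"name": "wf-a"},
            "status": {"phase": "Succeeded", "nodes": {"wf-a": {"phase": "Succeeded"}}},
        }
    ],
}

print(select_fields(response, ["items.status.nodes", "items.metadata.name"]))
```

A mask like this can only keep what the backing store returned in the first place, which is why fields that are never read from the archive DB cannot be brought back by the `fields` parameter alone.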

jacek-jablonski commented 1 month ago

@agilgur5 I've set in my config:

workflowDefaults:
  spec:
    ttlStrategy:
      secondsAfterCompletion: 345600 # 4 days

so that I could get some more information on the completed workflows. But even though they are still live in the k8s cluster, some fields are still missing (it seems that once a workflow is archived, the data comes from the DB, not from the cluster). This is quite problematic if you have a custom GUI, and it prevents us from updating to 3.5.6+.

jacek-jablonski commented 1 month ago

If I disable the archive, the fields are visible again for completed workflows, so that confirms the above situation.
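For a custom GUI that has to keep the archive enabled, one possible stopgap is to request the missing workflows one by one, since the thread notes the problem only occurs on the list endpoint. The sketch below only builds the per-workflow URLs for list items that lack `status.nodes`. The `detail_urls_for_items_without_nodes` helper and the sample items are hypothetical, and whether the single-workflow response contains every field you need for archived workflows is an assumption to verify on your own version.

```python
from urllib.parse import quote


def detail_urls_for_items_without_nodes(base_url, items):
    """Return per-workflow URLs for list items that came back without status.nodes."""
    base = base_url.rstrip("/")
    urls = []
    for item in items:
        if "nodes" in item.get("status", {}):
            continue  # the list response already carries the nodes
        meta = item["metadata"]
        namespace = quote(meta["namespace"], safe="")
        name = quote(meta["name"], safe="")
        urls.append(f"{base}/api/v1/workflows/{namespace}/{name}")
    return urls


# Hypothetical list items: one running workflow with nodes, one completed without.
items = [
    {"metadata": {"namespace": "argo", "name": "wf-running"},
     "status": {"phase": "Running", "nodes": {"wf-running": {}}}},
    {"metadata": {"namespace": "argo", "name": "wf-done"},
     "status": {"phase": "Succeeded"}},
]

print(detail_urls_for_items_without_nodes("https://argo-workflows.xxx", items))
```

This trades one list call for N extra requests, so it is only practical for small pages.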