kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

[frontend] run detail page get unexpected reloaded #10590

Open lethee opened 4 months ago

lethee commented 4 months ago

Environment

$ k get pod
NAME                                                     READY   STATUS    RESTARTS      AGE
admission-webhook-deployment-6c9678d48b-dl5dw            1/1     Running   0             35d
cache-server-574ddb7d97-t4s7l                            2/2     Running   0             35d
centraldashboard-5f7856dd97-88ksq                        2/2     Running   0             68m
jupyter-web-app-deployment-597655bf9c-gnngh              2/2     Running   0             28d
kubeflow-pipelines-profile-controller-6f6bc888df-t9d4b   1/1     Running   0             35d
metacontroller-0                                         1/1     Running   0             35d
metadata-envoy-deployment-85cc676d57-hbxqp               1/1     Running   0             35d
metadata-grpc-deployment-98fd89ff6-sglnc                 2/2     Running   2 (35d ago)   35d
metadata-writer-6859d4ffc6-2j5kb                         2/2     Running   0             34d
minio-66669fbd94-jtgqn                                   2/2     Running   0             34d
ml-pipeline-6675cd9b94-nfp74                             2/2     Running   2 (52m ago)   68m
ml-pipeline-persistenceagent-5bdc59674b-994s2            2/2     Running   0             68m
ml-pipeline-scheduledworkflow-5b47b9d5f5-ddr6f           2/2     Running   0             35d
ml-pipeline-ui-789c7b46cf-kg5w5                          2/2     Running   0             68m
ml-pipeline-viewer-crd-765d85855b-wt6c2                  2/2     Running   1 (35d ago)   35d
ml-pipeline-visualizationserver-6c49dff8dc-sdn7n         2/2     Running   0             35d
mysql-85d4f56c9b-pspjv                                   2/2     Running   0             33d
notebook-controller-deployment-7d4b968b5-4q9sj           2/2     Running   1 (35d ago)   35d
profiles-deployment-7bf5788f98-tmhgq                     3/3     Running   1 (35d ago)   35d
pvcviewer-controller-manager-666cf58bf8-xvtzk            3/3     Running   1 (35d ago)   35d
tensorboard-controller-deployment-844967d94b-jfn49       3/3     Running   1 (35d ago)   35d
tensorboards-web-app-deployment-7bff589c99-2dkh5         2/2     Running   0             35d
training-operator-754d664965-hxbvq                       1/1     Running   0             35d
volumes-web-app-deployment-68675f95d9-w6b86              2/2     Running   0             35d
workflow-controller-545cbd7ddb-8bh58                     2/2     Running   1 (35d ago)   35d

Steps to reproduce

  1. Open run details page
  2. Wait 30 seconds, then click anywhere on the run details page; the page reloads.

or

  1. Open run details page
  2. Switch to another browser tab or to another (non-browser) window.
  3. Switch back to the run details page or its tab; the page reloads.

I think only the inner iframe is refreshed.

Expected result

These are the only logs that appear in the browser inspector.

Screenshot 2024-03-20 at 12 06 25


Impacted by this bug? Give it a 👍.

If you need more configuration details or variables, I will reply with values from my environment.

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

juliusvonkohout commented 2 months ago

not stale

droctothorpe commented 2 months ago

This same issue is impacting many of our end users unfortunately.

juliusvonkohout commented 2 months ago

/lifecycle frozen

InvisibleMan1306 commented 1 month ago

I've bisected this issue: it doesn't occur in gcr.io/ml-pipeline/frontend:2.0.0-beta.0 but it does occur in gcr.io/ml-pipeline/frontend:2.0.0-beta.1. I think the issue is somewhere in the https://github.com/kubeflow/pipelines/blob/2.0.0-beta.1/frontend/src/pages/RunDetailsV2.tsx code; I will continue to dig.

Also, the latest gcr.io/ml-pipeline/frontend:2.2.0 does not fix this issue as of this posting.

InvisibleMan1306 commented 1 month ago

Would appreciate help from @jlyaoyuli on the changes in https://github.com/kubeflow/pipelines/commit/ca2004ca6c35759bc2a1b8d47e7e77431166b2f7. My current intuition is that the hasFinishedV2() call is triggered by the onFocus event but validates incorrectly, causing a refresh. This assumes that the functionality from https://github.com/kubeflow/pipelines/blob/master/frontend/src/pages/RunDetails.tsx#L715 was moved to RunDetailsV2.tsx.

InvisibleMan1306 commented 1 month ago

OK, I narrowed the issue down to a commit for RunDetailsRouter.tsx. Specifically, commenting out the following lines fixes it:

  if (runIsFetching || templateStrIsFetching) {
    return <div>Currently loading recurring run information</div>;
  }

This early return triggers componentWillUnmount(), after which RunDetails.tsx defaults runFinished to false, which causes a refresh in https://github.com/kubeflow/pipelines/blob/2.0.0/frontend/src/pages/RunDetails.tsx#L926-L929.
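
To make the mechanism concrete, here is a minimal sketch of it outside React. It is a toy model, not KFP code: `DetailsPage`, `Router`, `GateFlags`, and `simulate` are hypothetical names. It also assumes that the fetching flags become true again on a background refetch (for example when the window regains focus), which has not been confirmed in this thread. The second gate, which shows the placeholder only while no data exists yet, is included purely for contrast.

```typescript
// Toy model only: none of these names exist in KFP or React.
// It mimics one rule of component trees: when a parent stops rendering a
// child, the child is unmounted, and rendering it again creates a new instance.

class DetailsPage {
  // Mirrors the report: a fresh instance starts with runFinished = false.
  runFinished = false;
}

type GateFlags = { isFetching: boolean; hasData: boolean };

class Router {
  child: DetailsPage | null = null;
  mounts = 0;
  unmounts = 0;
  showPlaceholder: (flags: GateFlags) => boolean;

  constructor(showPlaceholder: (flags: GateFlags) => boolean) {
    this.showPlaceholder = showPlaceholder;
  }

  render(flags: GateFlags): string {
    if (this.showPlaceholder(flags)) {
      // Early return: the child drops out of the tree and is unmounted.
      if (this.child !== null) {
        this.child = null;
        this.unmounts++;
      }
      return 'placeholder';
    }
    if (this.child === null) {
      // (Re)mount: state starts from its defaults again.
      this.child = new DetailsPage();
      this.mounts++;
    }
    return 'details';
  }
}

// Sequence: initial load, data arrives, the run finishes,
// an assumed background refetch starts, the refetch completes.
function simulate(router: Router): { mounts: number; unmounts: number; runFinished: boolean } {
  router.render({ isFetching: true, hasData: false });
  router.render({ isFetching: false, hasData: true });
  router.child!.runFinished = true;
  router.render({ isFetching: true, hasData: true });
  router.render({ isFetching: false, hasData: true });
  return { mounts: router.mounts, unmounts: router.unmounts, runFinished: router.child!.runFinished };
}

// Gate as in the snippet above: placeholder whenever a fetch is in flight.
const gateOnFetching = simulate(new Router(flags => flags.isFetching));
// Contrast gate (an assumption, not a tested fix): placeholder only without data.
const gateOnMissingData = simulate(new Router(flags => !flags.hasData));

console.log(gateOnFetching);    // { mounts: 2, unmounts: 1, runFinished: false }
console.log(gateOnMissingData); // { mounts: 1, unmounts: 0, runFinished: true }
```

Under these assumptions, gating on the fetching flag remounts the child and loses runFinished, while gating only on missing data keeps the same instance. Whether the second gate is appropriate for RunDetailsRouter.tsx is untested here.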

At this point I do not have a better fix, and I feel like there isn't enough interest in fixing a backwards compatibility issue with V1.