opendatahub-io / odh-dashboard

Dashboard for ODH
Apache License 2.0

[Bug]: UI does not properly read data about created notebooks, which results in Unknown and Deleted keywords on properly working notebooks #2269

Closed. Frawless closed this issue 11 months ago

Frawless commented 11 months ago

Is there an existing issue for this?

Deploy type

OpenDataHub core version (eg. v1.6.0)

Version

incubation (2.4.0)

Current Behavior

When I create a Notebook via a CR, it gets created and managed by the Notebook controller and is shown in the UI. However, the user can define specific resources and an image in the Notebook CR. This results in the following state:

image

This problem also occurs when the user specifies values generated by the UI.

Expected Behavior

The UI does not show red items when the notebook is working properly.

Steps To Reproduce

  1. Deploy ODH from incubation branch
  2. Create DSC - https://github.com/skodjob/deployment-hub/blob/main/open-data-hub/install/data-science-cluster.yaml
  3. Create Notebook CR - https://github.com/skodjob/deployment-hub/blob/main/open-data-hub/resources/project-homelander/workbench.yaml#L21

Workaround (if any)

No response

What browsers are you seeing the problem on?

No response

Anything else

No response

shalberd commented 11 months ago

I assume you mean you created the CR outside the ODH dashboard GUI. It could well be that some dashboard-specific annotations and labels are missing. Compare your Notebook CR https://github.com/skodjob/deployment-hub/blob/main/open-data-hub/resources/project-homelander/workbench.yaml#L21

with one created via the ODH dashboard Data Science Projects "New Workbench" GUI. I am pretty sure you will find differences.
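One way to make that comparison concrete is to diff only the `metadata.annotations` and `metadata.labels` of the two manifests. The following is a minimal sketch, not dashboard code: the function name `metadata_diff` and both trimmed manifests are hypothetical and exist only for illustration, and the manifests are assumed to be already loaded into plain dictionaries.

```python
def metadata_diff(manual: dict, generated: dict) -> dict:
    """Return annotation/label keys whose values differ between two manifests."""
    diff = {}
    for section in ("annotations", "labels"):
        a = manual.get("metadata", {}).get(section) or {}
        b = generated.get("metadata", {}).get(section) or {}
        for key in sorted(set(a) | set(b)):
            if a.get(key) != b.get(key):
                # Store (value in manual CR, value in dashboard-created CR).
                diff[f"{section}/{key}"] = (a.get(key), b.get(key))
    return diff


# Hypothetical, heavily trimmed manifests for illustration only.
manual_cr = {"metadata": {"name": "my-workbench", "labels": {"app": "my-workbench"}}}
dashboard_cr = {
    "metadata": {
        "name": "my-workbench",
        "labels": {"app": "my-workbench"},
        "annotations": {
            "notebooks.opendatahub.io/last-image-selection": "jupyter-pytorch-notebook:2023.2",
            "notebooks.opendatahub.io/last-size-selection": "Small",
        },
    }
}

for key, (mine, theirs) in metadata_diff(manual_cr, dashboard_cr).items():
    print(f"{key}: manual={mine!r} dashboard={theirs!r}")
```

Keys that show up only on the dashboard side are the first candidates for whatever the UI expects to find.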

Regarding notebook image column:

You've got the env var JUPYTER_IMAGE with the value .... jupyter-pytorch-notebook:2023.2. Do you have an imagestream in the main ODH applications namespace opendatahub with the name jupyter-pytorch-notebook and a spec.tags entry named 2023.2?
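The check asked about here can be sketched as follows. This is an illustration of the question only, not the dashboard's actual detection logic: the helper `image_selection_exists` is a hypothetical name, and the ImageStream is a trimmed dictionary that assumes only the `metadata.name` and `spec.tags[].name` fields.

```python
def image_selection_exists(selection: str, image_stream: dict) -> bool:
    """Check whether a 'name:tag' selection matches an ImageStream and one of its spec tags."""
    name, _, tag = selection.partition(":")
    if image_stream.get("metadata", {}).get("name") != name:
        return False
    tags = image_stream.get("spec", {}).get("tags") or []
    return any(t.get("name") == tag for t in tags)


# Trimmed, illustrative ImageStream; only the fields used by the check are kept.
image_stream = {
    "metadata": {"name": "jupyter-pytorch-notebook", "namespace": "opendatahub"},
    "spec": {"tags": [{"name": "1.2"}, {"name": "2023.1"}, {"name": "2023.2"}]},
}

print(image_selection_exists("jupyter-pytorch-notebook:2023.2", image_stream))  # True
print(image_selection_exists("jupyter-pytorch-notebook:2024.1", image_stream))  # False
```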

Regarding Container size column:

That depends on the notebook's metadata.annotations, specifically last-size-selection, and not on the actual resources (limits / requests) in the podspec containers.

    notebooks.opendatahub.io/last-image-selection: 'jupyter-pytorch-notebook:2023.2'
    notebooks.opendatahub.io/last-size-selection: Small

The selected size needs a matching entry in the notebookSizes section of the odhDashboardConfig, which is static:

https://github.com/opendatahub-io/odh-dashboard/blob/main/docs/dashboard-config.md

That is, you cannot have arbitrary combinations of resource limits and requests in manually created Notebook CRs. You have to define them in dashboardConfig notebookSizes and set the last-size-selection annotation; the resources then get updated / set accordingly.
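The lookup described above can be sketched roughly like this. It is a simplified illustration under stated assumptions, not the dashboard's implementation: `resolve_size` is a hypothetical helper, and the `notebook_sizes` entries (names and resource values) are made-up examples of a `name` plus `resources` list.

```python
SIZE_ANNOTATION = "notebooks.opendatahub.io/last-size-selection"


def resolve_size(notebook: dict, notebook_sizes: list):
    """Return the configured size entry named by the annotation, or None if there is no match."""
    annotations = notebook.get("metadata", {}).get("annotations") or {}
    selected = annotations.get(SIZE_ANNOTATION)
    for size in notebook_sizes:
        if size.get("name") == selected:
            return size
    return None


# Hypothetical size definitions; real values come from the cluster's dashboard config.
notebook_sizes = [
    {"name": "Small", "resources": {"requests": {"cpu": "1", "memory": "1Gi"},
                                    "limits": {"cpu": "2", "memory": "2Gi"}}},
    {"name": "Medium", "resources": {"requests": {"cpu": "2", "memory": "4Gi"},
                                     "limits": {"cpu": "4", "memory": "8Gi"}}},
]

annotated = {"metadata": {"annotations": {SIZE_ANNOTATION: "Small"}}}
unannotated = {"metadata": {"name": "manual-workbench"}}

print(resolve_size(annotated, notebook_sizes)["name"])  # Small
print(resolve_size(unannotated, notebook_sizes))        # None
```

Note that in this sketch the container's own requests and limits never enter the lookup, which matches the point that the column is driven by the annotation rather than by the podspec.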

shalberd commented 11 months ago

@andrewballantyne @lucferbux

Frawless commented 11 months ago

@shalberd hey, I use the Notebook mentioned above, which is basically scraped from a Notebook created by the UI. The imagestream is available in OCP; it is basically the default configuration of ODH. I tried to set the annotations

notebooks.opendatahub.io/last-image-selection: 'jupyter-pytorch-notebook:2023.2'
notebooks.opendatahub.io/last-size-selection: Small

but without any change. The only change happens when I set it directly via the UI, but that also results in a Notebook rolling update.

The operator reconciles my Notebook, even with custom resources, so I would expect it to be a valid configuration. So maybe there should be some warning in the CRs? I didn't notice any.

Thanks for sharing the dashboard config info, this is new for me as well :)

shalberd commented 11 months ago

Can you share the YAML of your jupyter-pytorch-notebook imagestream here, please?

Frawless commented 11 months ago

kind: ImageStream
apiVersion: image.openshift.io/v1
metadata:
  annotations:
    opendatahub.io/notebook-image-order: '4'
    opendatahub.io/notebook-image-url: 'https://github.com/opendatahub-io/notebooks/blob/main/jupyter/pytorch'
    internal.config.kubernetes.io/previousNamespaces: default
    internal.config.kubernetes.io/previousKinds: ImageStream
    opendatahub.io/notebook-image-name: PyTorch
    internal.config.kubernetes.io/previousNames: jupyter-pytorch-notebook
    opendatahub.io/recommended-accelerators: '["nvidia.com/gpu"]'
    openshift.io/image.dockerRepositoryCheck: '2023-12-06T19:25:49Z'
    opendatahub.io/notebook-image-desc: >-
      Jupyter notebook image with PyTorch libraries and dependencies to start
      experimenting with advanced AI/ML notebooks.
  resourceVersion: '65570144'
  name: jupyter-pytorch-notebook
  uid: 8aac2ea6-1e34-413e-b627-e09c1c81bcd0
  creationTimestamp: '2023-12-06T19:25:47Z'
  generation: 2
  managedFields:
    - manager: default
      operation: Apply
      apiVersion: image.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            'f:internal.config.kubernetes.io/previousKinds': {}
            'f:internal.config.kubernetes.io/previousNames': {}
            'f:internal.config.kubernetes.io/previousNamespaces': {}
            'f:opendatahub.io/notebook-image-desc': {}
            'f:opendatahub.io/notebook-image-name': {}
            'f:opendatahub.io/notebook-image-order': {}
            'f:opendatahub.io/notebook-image-url': {}
            'f:opendatahub.io/recommended-accelerators': {}
          'f:labels':
            'f:app.opendatahub.io/workbenches': {}
            'f:component.opendatahub.io/name': {}
            'f:opendatahub.io/component': {}
            'f:opendatahub.io/notebook-image': {}
        'f:spec':
          'f:lookupPolicy':
            'f:local': {}
          'f:tags':
            'k:{"name":"1.2"}':
              .: {}
              'f:annotations':
                'f:opendatahub.io/image-tag-outdated': {}
                'f:opendatahub.io/notebook-build-commit': {}
                'f:opendatahub.io/notebook-python-dependencies': {}
                'f:opendatahub.io/notebook-software': {}
                'f:openshift.io/imported-from': {}
              'f:from': {}
              'f:name': {}
              'f:referencePolicy':
                'f:type': {}
            'k:{"name":"2023.1"}':
              .: {}
              'f:annotations':
                'f:opendatahub.io/notebook-build-commit': {}
                'f:opendatahub.io/notebook-python-dependencies': {}
                'f:opendatahub.io/notebook-software': {}
                'f:opendatahub.io/workbench-image-recommended': {}
                'f:openshift.io/imported-from': {}
              'f:from': {}
              'f:name': {}
              'f:referencePolicy':
                'f:type': {}
            'k:{"name":"2023.2"}':
              .: {}
              'f:annotations':
                'f:opendatahub.io/notebook-build-commit': {}
                'f:opendatahub.io/notebook-python-dependencies': {}
                'f:opendatahub.io/notebook-software': {}
                'f:opendatahub.io/workbench-image-recommended': {}
                'f:openshift.io/imported-from': {}
              'f:from': {}
              'f:name': {}
              'f:referencePolicy':
                'f:type': {}
    - manager: manager
      operation: Update
      apiVersion: image.openshift.io/v1
      time: '2023-12-06T19:25:47Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            'f:internal.config.kubernetes.io/previousNames': {}
            'f:opendatahub.io/recommended-accelerators': {}
            'f:opendatahub.io/notebook-image-name': {}
            'f:opendatahub.io/notebook-image-desc': {}
            .: {}
            'f:opendatahub.io/notebook-image-order': {}
            'f:opendatahub.io/notebook-image-url': {}
            'f:internal.config.kubernetes.io/previousKinds': {}
            'f:internal.config.kubernetes.io/previousNamespaces': {}
          'f:labels':
            .: {}
            'f:app.opendatahub.io/workbenches': {}
            'f:component.opendatahub.io/name': {}
            'f:opendatahub.io/component': {}
            'f:opendatahub.io/notebook-image': {}
          'f:ownerReferences':
            .: {}
            'k:{"uid":"2c0a7eee-5b99-4901-b19e-bfbf947c6bd8"}': {}
        'f:spec':
          'f:lookupPolicy':
            'f:local': {}
          'f:tags':
            .: {}
            'k:{"name":"1.2"}':
              .: {}
              'f:annotations':
                .: {}
                'f:opendatahub.io/image-tag-outdated': {}
                'f:opendatahub.io/notebook-build-commit': {}
                'f:opendatahub.io/notebook-python-dependencies': {}
                'f:opendatahub.io/notebook-software': {}
                'f:openshift.io/imported-from': {}
              'f:from': {}
              'f:generation': {}
              'f:importPolicy':
                .: {}
                'f:importMode': {}
              'f:name': {}
              'f:referencePolicy':
                .: {}
                'f:type': {}
            'k:{"name":"2023.1"}':
              .: {}
              'f:annotations':
                .: {}
                'f:opendatahub.io/notebook-build-commit': {}
                'f:opendatahub.io/notebook-python-dependencies': {}
                'f:opendatahub.io/notebook-software': {}
                'f:opendatahub.io/workbench-image-recommended': {}
                'f:openshift.io/imported-from': {}
              'f:from': {}
              'f:generation': {}
              'f:importPolicy':
                .: {}
                'f:importMode': {}
              'f:name': {}
              'f:referencePolicy':
                .: {}
                'f:type': {}
            'k:{"name":"2023.2"}':
              .: {}
              'f:annotations':
                .: {}
                'f:opendatahub.io/notebook-build-commit': {}
                'f:opendatahub.io/notebook-python-dependencies': {}
                'f:opendatahub.io/notebook-software': {}
                'f:opendatahub.io/workbench-image-recommended': {}
                'f:openshift.io/imported-from': {}
              'f:from': {}
              'f:generation': {}
              'f:importPolicy':
                .: {}
                'f:importMode': {}
              'f:name': {}
              'f:referencePolicy':
                .: {}
                'f:type': {}
  namespace: opendatahub
  ownerReferences:
    - apiVersion: datasciencecluster.opendatahub.io/v1
      kind: DataScienceCluster
      name: default
      uid: 2c0a7eee-5b99-4901-b19e-bfbf947c6bd8
      controller: true
      blockOwnerDeletion: true
  labels:
    app.opendatahub.io/workbenches: 'true'
    component.opendatahub.io/name: notebooks
    opendatahub.io/component: 'true'
    opendatahub.io/notebook-image: 'true'
spec:
  lookupPolicy:
    local: true
  tags:
    - name: '1.2'
      annotations:
        opendatahub.io/image-tag-outdated: 'true'
        opendatahub.io/notebook-build-commit: 4c8f26e
        opendatahub.io/notebook-python-dependencies: >-
          [{"name":"PyTorch","version":"1.8"},{"name":"Tensorboard","version":"2.6"},{"name":"Boto3","version":"1.17"},{"name":"Kafka-Python","version":"2.0"},{"name":"Matplotlib","version":"3.4"},{"name":"Numpy","version":"1.19"},{"name":"Pandas","version":"1.2"},{"name":"Scikit-learn","version":"0.24"},{"name":"Scipy","version":"1.6"}]
        opendatahub.io/notebook-software: >-
          [{"name":"CUDA","version":"11.4"},{"name":"Python","version":"v3.8"},{"name":"PyTorch","version":"1.8"}]
        openshift.io/imported-from: quay.io/opendatahub/notebooks
      from:
        kind: DockerImage
        name: >-
          quay.io/opendatahub/notebooks@sha256:94c5d01b19a0f30c0ca18153c50f18317f42c224e82321ef39c43116e7184731
      generation: 2
      importPolicy:
        importMode: Legacy
      referencePolicy:
        type: Source
    - name: '2023.1'
      annotations:
        opendatahub.io/notebook-build-commit: 17c2e49
        opendatahub.io/notebook-python-dependencies: >-
          [{"name":"PyTorch","version":"1.13"},{"name":"Tensorboard","version":"2.11"},{"name":"Boto3","version":"1.26"},{"name":"Kafka-Python","version":"2.0"},{"name":"Kfp-tekton","version":"1.5"},{"name":"Matplotlib","version":"3.6"},{"name":"Numpy","version":"1.24"},{"name":"Pandas","version":"1.5"},{"name":"Scikit-learn","version":"1.2"},{"name":"Scipy","version":"1.10"},{"name":"Elyra","version":"3.15"}]
        opendatahub.io/notebook-software: >-
          [{"name":"CUDA","version":"11.8"},{"name":"Python","version":"v3.9"},{"name":"PyTorch","version":"1.13"}]
        opendatahub.io/workbench-image-recommended: 'false'
        openshift.io/imported-from: quay.io/opendatahub/workbench-images
      from:
        kind: DockerImage
        name: >-
          quay.io/opendatahub/workbench-images@sha256:cf24bd469c283aeeeffa4ff3771ee10219f4446c4afef5f9d4c6c84c54bd81ce
      generation: 2
      importPolicy:
        importMode: Legacy
      referencePolicy:
        type: Source
    - name: '2023.2'
      annotations:
        opendatahub.io/notebook-build-commit: cf1b63e
        opendatahub.io/notebook-python-dependencies: >-
          [{"name":"PyTorch","version":"2.0"},{"name":"Tensorboard","version":"2.13"},{"name":"Boto3","version":"1.28"},{"name":"Kafka-Python","version":"2.0"},{"name":"Kfp-tekton","version":"1.5"},{"name":"Matplotlib","version":"3.6"},{"name":"Numpy","version":"1.24"},{"name":"Pandas","version":"1.5"},{"name":"Scikit-learn","version":"1.3"},{"name":"Scipy","version":"1.11"},{"name":"Elyra","version":"3.15"},{"name":"PyMongo","version":"4.5"},{"name":"Pyodbc","version":"4.0"},
          {"name":"Codeflare-SDK","version":"0.12"},
          {"name":"Sklearn-onnx","version":"1.15"},
          {"name":"Psycopg","version":"3.1"}, {"name":"MySQL
          Connector/Python","version":"8.0"}]
        opendatahub.io/notebook-software: >-
          [{"name":"CUDA","version":"11.8"},{"name":"Python","version":"v3.9"},{"name":"PyTorch","version":"2.0"}]
        opendatahub.io/workbench-image-recommended: 'true'
        openshift.io/imported-from: quay.io/opendatahub/workbench-images
      from:
        kind: DockerImage
        name: >-
          quay.io/opendatahub/workbench-images@sha256:3881889e511bde525d560b7dbbd655ea7586d7bed89502d1a4ce55ac24866ab1
      generation: 2
      importPolicy:
        importMode: Legacy
      referencePolicy:
        type: Source
status:
  dockerImageRepository: >-
    image-registry.openshift-image-registry.svc:5000/opendatahub/jupyter-pytorch-notebook
  tags:
    - tag: '1.2'
      items:
        - created: '2023-12-06T19:25:49Z'
          dockerImageReference: >-
            quay.io/opendatahub/notebooks@sha256:94c5d01b19a0f30c0ca18153c50f18317f42c224e82321ef39c43116e7184731
          image: >-
            sha256:94c5d01b19a0f30c0ca18153c50f18317f42c224e82321ef39c43116e7184731
          generation: 2
    - tag: '2023.1'
      items:
        - created: '2023-12-06T19:25:49Z'
          dockerImageReference: >-
            quay.io/opendatahub/workbench-images@sha256:cf24bd469c283aeeeffa4ff3771ee10219f4446c4afef5f9d4c6c84c54bd81ce
          image: >-
            sha256:cf24bd469c283aeeeffa4ff3771ee10219f4446c4afef5f9d4c6c84c54bd81ce
          generation: 2
    - tag: '2023.2'
      items:
        - created: '2023-12-06T19:25:49Z'
          dockerImageReference: >-
            quay.io/opendatahub/workbench-images@sha256:3881889e511bde525d560b7dbbd655ea7586d7bed89502d1a4ce55ac24866ab1
          image: >-
            sha256:3881889e511bde525d560b7dbbd655ea7586d7bed89502d1a4ce55ac24866ab1
          generation: 2

It was generated during ODH installation.

andrewballantyne commented 11 months ago

We will need to migrate this to Jira -- but I bumped the priority of this issue as I think this is a good bug to fix asap. We overreached with our detection logic: it fell into a "bad state" and assumed the image stream was deleted (which it is not in this case).

kornys commented 11 months ago

@andrewballantyne could you please paste the Jira link here?

adnankhan666 commented 11 months ago

https://issues.redhat.com/browse/RHOAIENG-822