canonical / bundle-kubeflow

Charmed Kubeflow
Apache License 2.0
103 stars 50 forks

Define a test plan for airgapped deployment of CKF #898

Closed NohaIhab closed 3 months ago

NohaIhab commented 4 months ago

Context

Currently, there are no defined tests for airgapped deployment. We need to define and document tests for CKF functionality in an airgapped environment.

What needs to get done

  1. Look into the feasible tests in an airgapped environment
  2. Define and document the set of tests to be run

Definition of Done

We have a test plan for airgapped deployment.

syncronize-issues-to-jira[bot] commented 4 months ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5721.

This message was autogenerated

NohaIhab commented 4 months ago

To define a test plan for airgapped, we need to cover the following components, similar to our UATs:

Each of the tests might need some prerequisites to run it in an airgapped environment. The prerequisites and testing instructions will be documented in a doc under the team's gdrive folder.

NohaIhab commented 4 months ago

Pipelines

The UATs for Pipelines currently have 2 known limitations that prevent them from running in an airgapped environment:

  1. before running the test, the kfp Python package is installed with pip
  2. python:3.7 is used as the base image for the pipeline components by default

Proposed solution

For 1., we can:

  1. create a requirements.in with kfp>=2.4,<3.0 (to be compatible with v2.0.5)
  2. run pip-compile requirements.in to generate a requirements.txt pinning the kfp package and all of its dependencies
  3. on a host machine with an internet connection, run pip download -r requirements.txt, which downloads the wheels of the packages without installing them
  4. tar the wheels of the downloaded packages into one archive and move the archive and the requirements.txt to the airgapped machine
  5. from the notebook, upload the archive and the requirements.txt, extract the archive into a /dependencies dir, and run:
    pip install -r requirements.txt --no-index --find-links "./dependencies"

For 2., we can set the base_image via kfp's dsl.component decorator to the Python image already present in the airgapped environment's local registry, so we don't have to import another image. It's the Python image used by kfp-profile-controller, so we would set the base_image attribute to 172.17.0.2:5000/python:3.11.9-alpine

Update 1

After running the above proposed steps to address 1., I'm getting this error inside the notebook when trying to pip install:

ERROR: Could not find a version that satisfies the requirement charset-normalizer==3.3.2 (from versions: none)
ERROR: No matching distribution found for charset-normalizer==3.3.2

I think this is due to incompatibility of the Python versions between the host where the requirements were compiled and the notebook's environment. From the notebook:

(base) jovyan@test-kfp-0:~$ python --version
Python 3.11.6
(base) jovyan@test-kfp-0:~$ pip --version
pip 23.3.1 from /opt/conda/lib/python3.11/site-packages/pip (python 3.11)

Meanwhile, the host is using Python 3.10 (the Ubuntu 22.04 default). I'll recompile the requirements.in with the same Python version and repeat the process.
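One quick way to catch this mismatch up front is to compare, on both machines, the CPython tag pip uses when selecting wheels. This is a standard-library-only sketch; the cp310/cp311 values in the comment are illustrative:

```python
# Sketch: print the CPython tag pip uses for wheel selection, so the host
# (where requirements are compiled and downloaded) and the notebook can be
# compared before moving the wheels into the airgapped environment.
import sys

tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
print(tag)  # e.g. cp310 on an Ubuntu 22.04 host, cp311 in the notebook
```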

Update 2

I've tested setting the base_image to 172.17.0.2:5000/python:3.11.9-alpine; however, the run failed with an error that it cannot find the kfp dependency. This means that the pipeline executor image itself needs to have kfp installed. To reuse existing images that do have kfp installed, I tried the jupyter-tensorflow-full image (see reference that it includes the kfp SDK). This failed with the error:

unable to create directory: permission denied

It turns out this is a known issue in Kubeflow Pipelines: https://github.com/kubeflow/pipelines/issues/10397. This means that non-root users don't have access to the paths where the executor should create the artifacts, and the jupyter-tensorflow-full image runs as non-root.

Final resort

To overcome the aforementioned issues, we've resorted to creating our own image, which will just be a Python base with the kfp package installed. This image will be published to the charmedkubeflow registry and used for testing in airgapped. The Dockerfile for the image will live in the bundle-kubeflow repo under tests/airgapped/pipelines.

NohaIhab commented 4 months ago

This task is getting too broad, since it covers testing multiple components. I discussed with @DnPlas that we will break this down into one task per component that needs a test.