kubeflow / testing

Test infrastructure and tooling for Kubeflow.
Apache License 2.0

Use python3 for run_e2e_workflow.py #684

Closed jlewi closed 4 years ago

jlewi commented 4 years ago

Currently our prow jobs are firing off run_e2e_workflow.py using python2, which is no longer supported.

We should switch to using python3.

Our prow jobs are configured in: https://github.com/kubernetes/test-infra/blob/5a71a6101fdca9159d7d6fd9e11ae8291c7fee35/config/jobs/kubeflow/kubeflow-presubmits.yaml

These basically specify a docker image to run:

gcr.io/kubeflow-ci/test-worker:latest

Currently this image is built from the python2 Dockerfile: https://github.com/kubeflow/testing/blob/master/images/Dockerfile

We do have a python3 docker image https://github.com/kubeflow/testing/blob/master/images/Dockerfile.py3

run_e2e_workflow.py might need some code changes though to be python3 compatible.
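
For illustration, one change that often comes up when a script that shells out to other tools moves from python2 to python3 is that subprocess output arrives as bytes rather than str. The sketch below is not taken from run_e2e_workflow.py; the helper name `run_and_capture` is hypothetical and only shows the general pattern.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command and return its stdout as text (str), not bytes."""
    # Under python3, check_output returns bytes, so decode explicitly.
    output = subprocess.check_output(command)
    return output.decode("utf-8")


if __name__ == "__main__":
    # Use the current interpreter as a portable stand-in for an external tool.
    text = run_and_capture([sys.executable, "-c", "print('hello')"])
    print(text.strip())
```

Whether run_e2e_workflow.py needs this kind of change would have to be checked against the actual code.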

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
kind/feature 0.80
area/engprod 0.88

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!

scottilee commented 4 years ago

@jlewi

  1. Which test is running run_e2e_workflow.py?
  2. What's the process of getting a new image onto gcr.io/kubeflow-ci/ and should it be named something like gcr.io/kubeflow-ci/test-worker-py3:latest?
  3. Scanning through https://github.com/kubeflow/testing/blob/master/py/kubeflow/testing/run_e2e_workflow.py I don't see anything that needs to be updated for Python 3 but how would I verify to make sure? Meaning how do I run this workflow as a test?
  4. Are there any other files in that directory that you know about that need to be updated to Python 3?
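
As a rough first check for question 3, a file can at least be parsed with the python3 grammar before running the full workflow. This is a minimal sketch, assuming a syntax-only check is useful as a first pass; the helper name `parses_as_python3` is hypothetical, and it does not catch runtime differences such as bytes versus str handling.

```python
import ast


def parses_as_python3(source):
    """Return True if the source text is valid syntax for the running python3."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print(parses_as_python3("print('ok')"))  # python3-style function call
    print(parses_as_python3("print 'ok'"))   # python2-style print statement
```

Passing this check only shows that the file parses; the workflow itself would still need to run under the python3 image to confirm it works.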

jlewi commented 4 years ago

@scottilee we should already have a python3 image, built via skaffold from this Dockerfile: https://github.com/kubeflow/testing/blob/master/images/Dockerfile.py3

These get posted to gcr.io/kubeflow-ci/test-worker-py3. I tagged gcr.io/kubeflow-ci/test-worker-py3@sha256:1b3e04157b4b27958a38b78e6f93c8741ce9a55f3bf2cbfaa50571734a6b0d06 as latest.

You can look at our Kubernetes test configs (https://github.com/kubernetes/test-infra/blob/master/config/jobs/kubeflow/kubeflow-presubmits.yaml) to see which configs are using run_e2e_workflow.py; pretty much all the tests should be using it. We use prow to fire off that docker container, which runs run_e2e_workflow.py to fire off the E2E tests.

Maybe try picking a repository to test with and updating it to use the python3 worker image?
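
The image swap itself is a one-line change per job. As a rough sketch of what that edit amounts to, the helper below rewrites the image reference in a piece of config text. The function name `switch_worker_image` and the sample fragment are made up for illustration; the fragment is not copied from kubeflow-presubmits.yaml.

```python
OLD_IMAGE = "gcr.io/kubeflow-ci/test-worker:latest"
NEW_IMAGE = "gcr.io/kubeflow-ci/test-worker-py3:latest"


def switch_worker_image(config_text):
    """Return config_text with the python2 worker image swapped for the python3 one."""
    updated_lines = []
    for line in config_text.splitlines():
        # Only touch lines that declare exactly the old container image.
        if "image:" in line and line.split("image:", 1)[1].strip() == OLD_IMAGE:
            line = line.replace(OLD_IMAGE, NEW_IMAGE)
        updated_lines.append(line)
    return "\n".join(updated_lines)


# Hypothetical fragment for illustration only.
SAMPLE = """\
- name: example-presubmit
  spec:
    containers:
    - image: gcr.io/kubeflow-ci/test-worker:latest
"""

if __name__ == "__main__":
    print(switch_worker_image(SAMPLE))
```

In practice the change would be made by hand in the job definition for the chosen repository and sent as a PR against kubernetes/test-infra.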

jlewi commented 4 years ago

@scottilee let's start with kubeflow/testing; would you mind sending a PR my way to update the tests for kubeflow/testing?

jlewi commented 4 years ago

@scottilee any plan to work on this?

jlewi commented 4 years ago

@scottilee kubernetes/test-infra#18195 doesn't seem to be working. Here's a recent test https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/kubeflow_testing/724/kubeflow-testing-presubmit/1280976215678652416/

Nothing actually ran; the pod started but logs are empty.

Looks like no entrypoint is set in https://github.com/kubeflow/testing/blob/master/images/Dockerfile.py3

Compare to https://github.com/kubeflow/testing/blob/c37f4bb06e268022f41214d5995df7e738e9800b/images/Dockerfile#L209
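
A quick way to spot this kind of difference between two Dockerfiles is to scan them for an ENTRYPOINT instruction. The sketch below is a simplification: the helper name `has_entrypoint` and both sample Dockerfiles are hypothetical, and it ignores line continuations and multi-stage builds.

```python
def has_entrypoint(dockerfile_text):
    """Return True if any instruction line in the Dockerfile text is ENTRYPOINT."""
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        # Dockerfile instruction names are case-insensitive.
        if stripped.split()[0].upper() == "ENTRYPOINT":
            return True
    return False


# Hypothetical Dockerfile contents for illustration only.
WITH_ENTRYPOINT = 'FROM python:3\nENTRYPOINT ["/usr/local/bin/run.sh"]\n'
WITHOUT_ENTRYPOINT = "FROM python:3\nRUN pip install requests\n"

if __name__ == "__main__":
    print(has_entrypoint(WITH_ENTRYPOINT))
    print(has_entrypoint(WITHOUT_ENTRYPOINT))
```

An image without ENTRYPOINT can still do work if CMD or the pod spec supplies a command, so a missing ENTRYPOINT is a hint rather than proof of the problem.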

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
area/testing 0.91

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!

jlewi commented 4 years ago

@scottilee ping?

scottilee commented 4 years ago

PR here with some questions: https://github.com/kubeflow/testing/pull/729.

jlewi commented 4 years ago

@scottilee rejoin the google group kubeflow-ci-team so you can build the image.

jlewi commented 4 years ago

@scottilee it looks like some of the refactoring you did broke the image. We should revert it.

In terms of automating the builds what we should do is create a tekton workflow that we run in presubmits and postsubmits. This would verify that the image builds correctly.

The tekton pipeline should be relatively simple; 1 task to run Kaniko to build the image. Some docs here: https://github.com/kubeflow/testing/blob/master/docs/tekton.md

jlewi commented 4 years ago

@scottilee based on the results in #732 it looks like we have some more work to do. The tests are failing with:

+ python -m kubeflow.testing.run_e2e_workflow --project=kubeflow-ci --zone=us-east1-d --cluster=kubeflow-testing --bucket=kubernetes-jenkins --config_file=/src/kubeflow/testing/prow_config.yaml --repos_dir=/src
/usr/bin/python: Error while finding module specification for 'kubeflow.testing.run_e2e_workflow' (ModuleNotFoundError: No module named 'kubeflow')

Looks like a python path issue.
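
To illustrate the failure mode, the sketch below reproduces the same class of error with a throwaway package: the module cannot be found until the directory that contains the top-level package is on sys.path (setting PYTHONPATH has the same effect). The package name `demo_pkg_684` and both helper names are hypothetical, and this is not necessarily the fix that the image needs.

```python
import importlib.util
import os
import sys
import tempfile


def can_find_module(module_name):
    """Return True if module_name resolves on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. the top-level one) is missing.
        return False


def make_demo_package(root, package_name):
    """Create root/<package_name>/testing/run_demo.py with __init__.py files."""
    top_dir = os.path.join(root, package_name)
    sub_dir = os.path.join(top_dir, "testing")
    os.makedirs(sub_dir)
    for directory in (top_dir, sub_dir):
        open(os.path.join(directory, "__init__.py"), "w").close()
    with open(os.path.join(sub_dir, "run_demo.py"), "w") as handle:
        handle.write("VALUE = 1\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_demo_package(root, "demo_pkg_684")
        # The package exists on disk but its parent directory is not on sys.path.
        print(can_find_module("demo_pkg_684.testing.run_demo"))
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        # Now the same module name resolves.
        print(can_find_module("demo_pkg_684.testing.run_demo"))
        sys.path.remove(root)
```

For the image, the equivalent check would be whether the directory holding the kubeflow package is on the python path inside the container.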

jlewi commented 4 years ago

@scottilee Is this something you will be able to work on? The tests for kubeflow/testing are currently not working because of the error mentioned above.

/cc @pingsutw

scottilee commented 4 years ago

Sorry for the delay. I created a PR to undo it for now until I can figure out how to fix the py3 image: https://github.com/kubernetes/test-infra/pull/18353.

scottilee commented 4 years ago

I was able to successfully build the Dockerfile.py3 image with the PR https://github.com/kubeflow/testing/pull/738. Let me know what you think.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in one week if no further activity occurs. Thank you for your contributions.

stale[bot] commented 4 years ago

This issue has been closed due to inactivity.