kubeflow / fairing

Python SDK for building, training, and deploying ML models
Apache License 2.0

pip install never finishes, and ImportError: cannot import name 'ServeRequest' from 'ray.serve.utils' in mnist e2e #565

Open 631068264 opened 2 years ago

631068264 commented 2 years ago

/kind bug

What steps did you take and what happened: I ran pip install kubeflow-fairing.

The install kept running for a long time, repeatedly trying to install the same packages at different versions.

Finally I tried pip install kubeflow-fairing --use-deprecated=legacy-resolver instead.

Then I ran the mnist e2e example script below, which hit the ImportError from the title:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import uuid

import yaml
from kubeflow import fairing
from kubeflow.fairing.kubernetes.utils import mounting_pvc
from kubernetes import client as k8s_client
from kubernetes import config as k8s_config

DOCKER_REGISTRY = '10.19.64.203:8080'
my_namespace = 'kserve-test'

num_chief = 1  # number of Chief in TFJob
num_ps = 1  # number of PS in TFJob
num_workers = 2  # number of Worker in TFJob
model_dir = "/mnt"
export_path = "/mnt/export"
train_steps = "1000"
batch_size = "100"
learning_rate = "0.01"

pvc_name = 'mnist-pvc'
pvc_yaml = f'''
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {pvc_name}
  namespace: {my_namespace}
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 10Gi
'''

k8s_config.load_kube_config()

k8s_core_api = k8s_client.CoreV1Api()
# k8s_core_api.create_persistent_volume(yaml.safe_load(pv_yaml))
k8s_core_api.create_namespaced_persistent_volume_claim(my_namespace, yaml.safe_load(pvc_yaml))

tfjob_name = f'mnist-training-{uuid.uuid4().hex[:4]}'

output_map = {
    "Dockerfile": "Dockerfile",
    "mnist.py": "mnist.py"
}

command = ["python",
           "/opt/mnist.py",
           "--tf-model-dir=" + model_dir,
           "--tf-export-dir=" + export_path,
           "--tf-train-steps=" + train_steps,
           "--tf-batch-size=" + batch_size,
           "--tf-learning-rate=" + learning_rate]

fairing.config.set_preprocessor('python', command=command, path_prefix="/app", output_map=output_map)
fairing.config.set_builder(name='docker', registry=DOCKER_REGISTRY,
                           image_name="mnist", dockerfile_path="Dockerfile")

fairing.config.set_deployer(name='tfjob', namespace=my_namespace, stream_log=False, job_name=tfjob_name,
                            chief_count=num_chief, worker_count=num_workers, ps_count=num_ps,
                            pod_spec_mutators=[mounting_pvc(pvc_name=pvc_name, pvc_mount_path=model_dir)])
fairing.config.run()

What did you expect to happen:

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

(.env) ➜  kubeflow git:(master) ✗ pip list |grep kube                                                     
kubeflow-fairing               1.0.2
kubeflow-pytorchjob            0.1.3
kubeflow-tfjob                 0.1.3
kubernetes                     10.0.1
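To help narrow this down, here is a minimal diagnostic sketch. It assumes (not confirmed in this report) that the ImportError comes from a ray version that does not match what the installed packages expect, possibly as a side effect of the legacy resolver. The helper names `installed_versions` and `has_symbol` are hypothetical and only for illustration; the script uses nothing beyond the standard library.

```python
import importlib
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def has_symbol(module_name, symbol):
    """Return True if module_name imports cleanly and exposes symbol."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing package and a missing submodule.
        return False
    return hasattr(module, symbol)


if __name__ == "__main__":
    # "ray" is included because the error in the title points at ray.serve.utils.
    packages = ["kubeflow-fairing", "kubernetes", "ray"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
    print("ray.serve.utils.ServeRequest available:",
          has_symbol("ray.serve.utils", "ServeRequest"))
```

Running this in the same virtualenv shows which ray version, if any, was installed and whether it still exposes ServeRequest, which may be useful to attach to the report.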

NOTE: If you are using fairing from master, please provide us the git commit hash.