ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[Ray CORE] Gitlab integration with authentication #46833

Open githuberj opened 1 month ago

githuberj commented 1 month ago

Description

We would like to use GitLab as our collaboration platform. We already use KubeRay and would like to let users run RayJobs and RayServices securely, similar to what is documented for GitHub in https://docs.ray.io/en/latest/ray-core/runtime_env_auth.html. We use a GitOps-based approach and would greatly appreciate this capability.

I currently have it running through an init container with SSH git-sync, but I would greatly appreciate a nicer workflow.

I also found a few related issues:

Use case

A KubeRay user can provide a valid authentication method (personal access token, group access token, project access token, or SSH key), and the operator and Ray cluster can then execute their code.

jjyao commented 1 month ago

Does the netrc approach work for GitLab?

githuberj commented 1 month ago

No, unfortunately I couldn't get GitLab to work with netrc.

jjyao commented 1 month ago

There seems to be a solution in https://gitlab.com/gitlab-org/gitlab/-/issues/350582 that you can check out.

githuberj commented 1 month ago

I tried it but couldn't get it to work with KubeRay, so I took a step back and tried it locally.

This code works:

import ray

good_env = {"working_dir": "https://github.com/ray-project/serve_config_examples/archive/refs/heads/master.zip"}

# Specify a runtime environment for the entire Ray job
ray.init(runtime_env=good_env, logging_level="DEBUG")

# Create a Ray task, which inherits the above runtime env.
@ray.remote
def f():
    # The function will have its working directory changed to its node's
    # local copy of /tmp/runtime_env_working_dir.
    return open("text_ml.py").read()

print(ray.get(f.remote()))

ray.shutdown()

But if I put the same code into a password-protected GitLab repo, I only get a RayTaskError(FileNotFoundError): ray::f() (pid=15935, ip=172.26.113.45) exception.

The logs also give no indication that anything went wrong with the runtime environment. I tried both setting the NETRC environment variable and using the default location, ~/.netrc.

jjyao commented 1 month ago

Make sure you run chmod 600 ~/.netrc. Also, can you try wget https://gitlab.com/xxx.zip first to make sure your netrc file is correct?
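
The permission and content check suggested above can also be done from Python before involving Ray at all. The following is a minimal sketch that uses only the standard library; the helper name check_netrc and the demo login and token are hypothetical placeholders, and it only confirms that the file is readable by the owner alone and contains an entry for the host, not that GitLab accepts the credentials.

```python
import netrc
import os
import stat
import tempfile


def check_netrc(path, host):
    """Return (mode_ok, login) for `host` in the netrc file at `path`.

    Hypothetical helper for illustration; not part of Ray or GitLab.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # chmod 600 leaves no permission bits set for group or others.
    mode_ok = (mode & 0o077) == 0
    # authenticators() returns (login, account, password) or None.
    auth = netrc.netrc(path).authenticators(host)
    login = auth[0] if auth else None
    return mode_ok, login


# Demo on a throwaway file; the login and token are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc")
    with open(demo_path, "w") as fh:
        fh.write("machine gitlab.com login demo-user password demo-token\n")
    os.chmod(demo_path, 0o600)
    demo_result = check_netrc(demo_path, "gitlab.com")
    print(demo_result)
```

To check a real setup, call check_netrc(os.path.expanduser("~/.netrc"), "gitlab.com") and confirm that the first value is True and the second is the expected login.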