ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[runtime_env] Allow specifying existing `virtualenv` path in `pip` field #28460

Open architkulkarni opened 2 years ago

architkulkarni commented 2 years ago

Description

Allow specifying a local path to a preinstalled virtualenv that exists on all nodes: RuntimeEnv(pip="/path/to/env/").

This should activate the virtualenv for all Ray workers without installing any packages, similar to how the conda field works when given the name of a conda environment that exists on all nodes.

This should take fewer than 100 lines of code, including tests. In validation.py we need to distinguish between a requirements.txt file and a directory. Then in pip.py the create step can be mostly skipped, since there is nothing to install, and in modify_context we can use _PathHelper.get_virtualenv_activate_command to get the appropriate activation command to run before starting each worker.
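As a rough illustration of the validation step, here is a minimal sketch of telling a package list, a requirements.txt file, and a virtualenv directory apart. The function name `classify_pip_field` and the returned labels are hypothetical and are not taken from validation.py; the real change would need to fit whatever structure that file already has.

```python
import os
from typing import List, Union


def classify_pip_field(pip: Union[str, List[str]]) -> str:
    """Classify the value of the `pip` field.

    Hypothetical helper for illustration only. Returns one of
    "packages", "requirements_file", or "virtualenv_dir".
    """
    if isinstance(pip, list):
        # A list is treated as a list of pip package specifiers.
        if not all(isinstance(item, str) for item in pip):
            raise TypeError("every entry in the pip list must be a string")
        return "packages"
    if isinstance(pip, str):
        # A directory is assumed to be a preinstalled virtualenv.
        if os.path.isdir(pip):
            return "virtualenv_dir"
        # A regular file is assumed to be a requirements.txt file.
        if os.path.isfile(pip):
            return "requirements_file"
        raise ValueError(f"{pip!r} is neither a file nor a directory")
    raise TypeError(f"unsupported type for pip field: {type(pip).__name__}")
```

Note that this only checks the path on the machine doing the validation; it does not confirm that the directory exists on every node.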

Use case

On a single Ray cluster, run tasks and actors that have conflicting Python package dependencies, with the added constraint that these packages are too large or complicated to install at runtime.

evanrittenhouse commented 2 years ago

I can take this on if it's still free!

architkulkarni commented 2 years ago

@evanrittenhouse That's great to hear, please let me know if you need any pointers!

evanrittenhouse commented 2 years ago

@architkulkarni Hi, first off wanted to say thanks for the detailed issue. Makes it very easy to make progress on a new code base :).

Secondly, I'm a little unclear on how to use _PathHelper.get_virtualenv_activate_command() with the passed virtualenv directory. It seems like modify_context in python/ray/_private/runtime_env/pip.py already calls that function, so it may already work with virtualenvs/target directories. Could you please elaborate on that last step?

architkulkarni commented 2 years ago

I see, you're saying maybe we really don't need to do anything for the last step? I think that makes sense. In that case we just need to make sure the user can pass in the virtualenv directory, and then do the appropriate plumbing. Right now our API only lets the user pass in a list of pip packages or the path to a requirements.txt file.
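For context on the activation step discussed above, here is a minimal sketch of what an activation prefix for an existing virtualenv directory could look like. It assumes the standard virtualenv layout (`bin/activate` on POSIX, `Scripts\activate.bat` on Windows). Both function names are hypothetical; this is not the code in pip.py.

```python
import os
import sys
from typing import List


def activation_command(virtualenv_dir: str) -> List[str]:
    """Build a shell prefix that activates the virtualenv (hypothetical helper)."""
    if sys.platform == "win32":
        activate = os.path.join(virtualenv_dir, "Scripts", "activate.bat")
        return [activate, "&&"]
    activate = os.path.join(virtualenv_dir, "bin", "activate")
    return ["source", activate, "&&"]


def prefixed_worker_command(virtualenv_dir: str, worker_cmd: List[str]) -> str:
    """Prepend the activation prefix to a worker command line."""
    return " ".join(activation_command(virtualenv_dir) + worker_cmd)
```

With a preinstalled environment the prefix is the same as for a freshly created one, which is why the last step may need little or no change.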

evanrittenhouse commented 2 years ago

@architkulkarni Great! And I'm also a bit unsure where the tests for this should go. Can you please point me to the right file?

Appreciate all your help.

architkulkarni commented 2 years ago

Probably https://github.com/ray-project/ray/blob/master/python/ray/tests/test_runtime_env_validation.py and one of the test_runtime_env_* or test_runtime_env_conda_and_pip_* files, wherever you see other virtualenv-related tests. Sorry for the lack of structure here; we had to split up a lot of these tests to keep them from timing out.

evanrittenhouse commented 2 years ago

Sounds good! Hope to have a PR out soon - sorry, work's been insane this week

evanrittenhouse commented 2 years ago

@architkulkarni Sorry for the delay, coming back to this. How would I actually test this? Is it enough to simply pass a virtualenv to RuntimeEnv and check that it exists?

architkulkarni commented 1 year ago

Hi @evanrittenhouse, sorry to miss your message. You should be able to mimic what we do for the conda feature, where we have a test like https://github.com/ray-project/ray/blob/9cd4dd77495e2d5481db1face441016eba9a06fe/python/ray/tests/test_runtime_env_complicated.py#L162 (to pick a random example) which uses the pytest fixture conda_envs which installs temporary named conda environments: https://github.com/ray-project/ray/blob/9cd4dd77495e2d5481db1face441016eba9a06fe/python/ray/tests/test_runtime_env_complicated.py#L45

You could probably do something similar with virtualenv instead of conda.
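A minimal sketch of that idea, using only the standard library: a context manager that creates a throwaway environment and yields its path. It uses the stdlib `venv` module as a stand-in for the `virtualenv` package, and the name `temporary_virtualenv` is hypothetical; this is not an existing fixture in the Ray test suite.

```python
import contextlib
import os
import tempfile
import venv
from typing import Iterator


@contextlib.contextmanager
def temporary_virtualenv() -> Iterator[str]:
    """Create a temporary virtual environment and yield its directory.

    The environment is deleted when the context exits.
    """
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = os.path.join(tmp, "env")
        # with_pip=False keeps creation fast; a real test would likely
        # install a distinguishing package into the environment.
        venv.create(env_dir, with_pip=False)
        yield env_dir
```

This could be wrapped in a pytest fixture, and a test could then pass the yielded path to the proposed `RuntimeEnv(pip=...)` form and check inside a task that the expected interpreter or package is in use.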

antimonyGu commented 1 year ago

Hi. Since this issue has not been updated for a long time, can I continue to work on this issue? @architkulkarni

architkulkarni commented 1 year ago

@antimonyGu Sure! Feel free to tag me on your PR.