ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[Runtime Environment] Remove cached python libs, working dir etc #47488

Open sisilmehta733 opened 3 weeks ago

sisilmehta733 commented 3 weeks ago

What happened + What you expected to happen

Currently there is no way to:

  1. Remove existing cached files from the runtime environment, or
  2. Launch a Ray job without using any cached Python libraries, working directory, etc.

Versions / Dependencies

We're using Ray 2.35.0.

Reproduction script

NA

Issue Severity

Medium: It is a significant difficulty but I can work around it.

jjyao commented 2 weeks ago

Runtime env cached files are reference-counted, so they are automatically deleted when nothing is using them.

> Launch a ray job without using any of the caching of the python libs, working dir etc.

If your Ray job doesn't specify a runtime env, it won't use cached Python libs or working dirs from other jobs.

sisilmehta733 commented 2 weeks ago

@jjyao

  1. The behavior we're hoping for is to keep the runtime environment (since we want to install dependencies) but not use the caching. Some of our dependencies are internal libraries without versioning, etc.
  2. We also noticed that if you pass the requirements file as "pip": ["-r requirements.txt"], this file is cached even when its contents change, which is a bug.

Are there any env variables that can be set on the cluster or the job to disable this caching behavior?
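
For the requirements-file case in item 2, one possible workaround is to expand the file into an explicit package list before submitting, so that the runtime_env value itself changes whenever the file's contents change. The sketch below assumes that the cached environment is keyed on the runtime_env value rather than on the contents of files it refers to, which is what the behavior described above suggests. The helper name `pip_runtime_env` is hypothetical and not part of Ray.

```python
from pathlib import Path


def pip_runtime_env(requirements_path):
    """Return a runtime_env dict whose "pip" list holds the file's requirement lines.

    Hypothetical helper, not part of Ray: it inlines the requirements so that
    editing the file changes the runtime_env value passed to Ray.
    """
    packages = []
    for raw in Path(requirements_path).read_text().splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        packages.append(line)
    return {"pip": packages}
```

It could then be used as `ray.init(runtime_env=pip_runtime_env("requirements.txt"))`. Ray's documentation also describes passing the path directly as a string, `"pip": "requirements.txt"`, which may have a similar effect. Neither approach helps with item 1: an unversioned internal library yields the same requirement line even when its code changes.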