Closed ahuang11 closed 1 year ago
Ugh the classic standard library doesn't have a dedicated namespace issue. What do you mean re this workaround?
In ray.workers
module:
if mode == SCRIPT_MODE:
# Add the directory containing the script that is running to the Python
# paths of the workers. Also add the current directory. Note that this
# assumes that the directory structures on the machines in the clusters
# are the same.
# When using an interactive shell, there is no script directory.
if not interactive_mode:
script_directory = os.path.abspath(os.path.dirname(sys.argv[0]))
worker.run_function_on_all_workers(
lambda worker_info: sys.path.insert(1, script_directory)
)
The reason why this doesn't happen running flows locally vs deployments is because when Prefect runs deployed code, it executes the script through .../lib/python3.8/site-packages/prefect/engine.py
and the root directory is .../lib/python3.8/site-packages/prefect
where packaging also lives.
At the moment, working with the Ray OSS team to resolve this.
Really interesting. Let me know if you need anything.
How does engine.py
execute the Ray script?
python -m prefect.engine <UUID>
python -m prefect.engine <UUID>
is the entrypoint to Prefect flow runs. The flow run process then calls ray.init()
(see the RayTaskRunner
implementation here) and task runs are submitted to the Ray cluster.
The reason why we have this behavior is because https://docs.python.org/3/library/sys.html#sys.path
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of [PYTHONPATH](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH).
Currently there is no easy way to disable this behavior besides the workaround Andrew mentioned.
I think https://blog.cykerway.com/posts/2020/12/15/python-name-conflict-with-built-in-module.html is essentially the problem we are facing here.
We're not running a script though, we're using python -m <module>
. I'm not sure why Ray needs to modify the path in this case.
For example:
# example.py
import sys
print(sys.path[0])
Displays the path of the script
❯ python src/prefect/example.py
/Users/mz/dev/prefect/src/prefect
Displays my current working directory
❯ python -m prefect.example
/Users/mz/dev/prefect
Oh, got it. Then I feel it's a bug on our side, we diverge from the default python behavior.
Could you check to see if https://github.com/ray-project/ray/pull/28043 fixes the issue?
Neat! It works! Thanks @jjyao for the quick fix!
10:27:39.736 | INFO | prefect.agent - Submitting flow run '6d01c2cd-0af1-4f55-91b6-9349b524e3f3'
10:27:40.393 | INFO | prefect.infrastructure.process - Opening process 'cornflower-puma'...
10:27:40.399 | INFO | prefect.agent - Completed submission of flow run '6d01c2cd-0af1-4f55-91b6-9349b524e3f3'
10:27:41.785 | WARNING | ray.worker - Failed to set SIGTERM handler, processes mightnot be cleaned up properly on exit.
10:27:42.029 | INFO | prefect.task_runner.ray - Creating a local Ray instance
2022-08-22 10:27:43,661 INFO services.py:1470 -- View the Ray dashboard at http://127.0.0.1:8265
10:27:45.034 | INFO | prefect.task_runner.ray - Using Ray cluster with 1 nodes.
10:27:45.034 | INFO | prefect.task_runner.ray - The Ray UI is available at 127.0.0.1:8265
10:27:46.189 | INFO | Flow run 'cornflower-puma' - Created task run 'say_hello-0b28e502-0' for task 'say_hello'
10:27:46.196 | INFO | Flow run 'cornflower-puma' - Submitted task run 'say_hello-0b28e502-0' for execution.
10:27:50.279 | INFO | Flow run 'cornflower-puma' - Finished in state Completed('All states completed.')
(begin_task_run pid=29639) hello Ford
10:27:50.447 | INFO | prefect.infrastructure.process - Process 'cornflower-puma' exited cleanly.
If possible, let us know when you merge and release so we can pin that specific version!
The PR is merged so it should be available in the next Ray release (2.1) or nightly wheel.
Exciting news! Will close this once ray 2.1 is released and we pin 2.1 in our requirements
2.1 has been released; closing this now.
The issue is that in a deployment, ray includes
/home/ray/anaconda3/lib/python3.8/site-packages/prefect/
in itssys.path
, and in this directory, there’s a packaging directory which confuses thefrom packaging import version
statement (it tries to pull from prefect.packaging rather than the pypi packaging).To workaround this issue, override
sys.argv[0] = str(Path.cwd())
inprefect_ray/task_runners.py