ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0
33.82k stars 5.75k forks source link

[Autoscaler] Assigning None to optional keys leads to failure #28012

Open GuillaumeDesforges opened 2 years ago

GuillaumeDesforges commented 2 years ago

What happened + What you expected to happen

When calling ray.autoscaler.sdk.create_or_update_cluster:

  File "/dev/.venv/lib/python3.9/site-packages/ray/autoscaler/sdk/sdk.py", line 38, in create_or_update_cluster
    return commands.create_or_update_cluster(
  File "/dev/.venv/lib/python3.9/site-packages/ray/autoscaler/_private/commands.py", line 274, in create_or_update_cluster
    config = _bootstrap_config(config, no_config_cache=no_config_cache)
  File "/dev/.venv/lib/python3.9/site-packages/ray/autoscaler/_private/commands.py", line 295, in _bootstrap_config
    config = prepare_config(config)
  File "/dev/.venv/lib/python3.9/site-packages/ray/autoscaler/_private/util.py", line 216, in prepare_config
    merge_setup_commands(with_defaults)
  File "/dev/.venv/lib/python3.9/site-packages/ray/autoscaler/_private/util.py", line 312, in merge_setup_commands
    config["setup_commands"] + config["head_setup_commands"]
TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'

when there should be no TypeError

Versions / Dependencies

ray 1.13.0

Reproduction script

Issue Severity

Low: It annoys or frustrates me.

GuillaumeDesforges commented 2 years ago

Should be fixed by something like

diff --git a/python/ray/autoscaler/_private/util.py b/python/ray/autoscaler/_private/util.py
index f9e5dbf2a..3c23a7419 100644
--- a/python/ray/autoscaler/_private/util.py
+++ b/python/ray/autoscaler/_private/util.py
@@ -330,12 +330,15 @@ def merge_legacy_yaml_with_defaults(merged_config: Dict[str, Any]) -> Dict[str,

 def merge_setup_commands(config):
-    config["head_setup_commands"] = (
-        config["setup_commands"] + config["head_setup_commands"]
-    )
-    config["worker_setup_commands"] = (
-        config["setup_commands"] + config["worker_setup_commands"]
-    )
+    if config["setup_commands"] is not None:
+        if config["head_setup_commands"] is not None:
+            config["head_setup_commands"] = (
+                config["setup_commands"] + config["head_setup_commands"]
+            )
+        if config["worker_setup_commands"] is not None:
+            config["worker_setup_commands"] = (
+                config["setup_commands"] + config["worker_setup_commands"]
+            )
     return config
GuillaumeDesforges commented 2 years ago

In my case, I'm using pydantic to generate a Python model of the config from the schema in the ray repo, and passed .dict() result to create_or_update_cluster .

I worked around it using .dict(exclude_none=True)