Closed by can-anyscale 2 months ago
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/3472#018e365f-7be5-4162-949d-9b68da190656
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/3580#018e4ff8-4d8b-4a2a-8cc0-ff02516b5248
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/3676#018e6340-753a-4637-a037-de1132a8d75b
CI test windows://python/ray/tests:test_implicit_resource is consistently_failing. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/3727#018e7b48-5de8-40cf-a8f8-71efe8fd88de
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/3788#018e9a2e-c1b8-4716-ba2d-9e506871cd38
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/4026#018ee7c3-5f0c-4255-8221-ce496a087c45
CI test windows://python/ray/tests:test_implicit_resource is consistently_failing. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Is there a way for me to access the buildkite.com pages? I think I need permissions.
The test does not show many failures on https://flaky-tests.ray.io/
CI test windows://python/ray/tests:test_implicit_resource is consistently_failing. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5074#01903e9c-9e28-48c5-ae5c-5fc161608b50
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Is there a way to see the logs? I don't seem to be able to view the Buildkite links.
@mattip got you. Yes, that pipeline is private; here is the log. You can also run it in a PR, since the PR pipeline is public.
[2024-06-11T18:14:50Z] ================================================================================
--
| [2024-06-11T18:14:50Z] ==================== Test output for //python/ray/tests:test_implicit_resource:
| [2024-06-11T18:14:50Z] ============================= test session starts =============================
| [2024-06-11T18:14:50Z] platform win32 -- Python 3.9.7, pytest-7.0.1, pluggy-1.3.0 -- C:\Miniconda3\python.exe
| [2024-06-11T18:14:50Z] cachedir: .pytest_cache
| [2024-06-11T18:14:50Z] rootdir: C:\Users\ContainerAdministrator\AppData\Local\Temp\Bazel.runfiles_t6unxcsg\runfiles\com_github_ray_project_ray
| [2024-06-11T18:14:50Z] plugins: anyio-3.7.1, asyncio-0.16.0, docker-tools-3.1.3, forked-1.4.0, httpserver-1.0.6, lazy-fixture-0.6.3, rerunfailures-11.1.2, shutil-1.7.0, sphinx-0.5.1.dev0, sugar-0.9.5, timeout-2.1.0, virtualenv-1.7.0
| [2024-06-11T18:14:50Z] collecting ... collected 3 items
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] python/ray/tests/test_implicit_resource.py::test_implicit_resource 2024-06-11 18:04:55,237 INFO worker.py:1761 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
| Error creating PyTest summary | 0s
| [2024-06-11T18:14:50Z] [Errno 2] No such file or directory: 'C:/artifact-mount/test-summaries\\python/ray/tests/test_implicit_resource.py$$test_implicit_resource.txt'
| [2024-06-11T18:14:50Z] FAILED
| [2024-06-11T18:14:50Z] python/ray/tests/test_implicit_resource.py::test_implicit_resource_autoscaling[v1] 2024-06-11 18:05:02,913 - INFO - NumExpr defaulting to 1 threads.
| [2024-06-11T18:14:50Z] Did not find any active Ray processes.
| [2024-06-11T18:14:50Z] 2024-06-11 18:05:05,205 - INFO - NumExpr defaulting to 1 threads.
| [2024-06-11T18:14:50Z] Usage stats collection is disabled.
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] Local node IP: 172.30.241.182
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] --------------------
| [2024-06-11T18:14:50Z] Ray runtime started.
| [2024-06-11T18:14:50Z] --------------------
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] Next steps
| [2024-06-11T18:14:50Z] To add another node to this Ray cluster, run
| [2024-06-11T18:14:50Z] RAY_ENABLE_WINDOWS_OR_OSX_CLUSTER=1 ray start --address='172.30.241.182:6379'
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To connect to this Ray cluster:
| [2024-06-11T18:14:50Z] import ray
| [2024-06-11T18:14:50Z] ray.init()
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To submit a Ray job using the Ray Jobs CLI:
| [2024-06-11T18:14:50Z] RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir . -- python my_script.py
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html
| [2024-06-11T18:14:50Z] for more information on submitting Ray jobs to the Ray cluster.
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To terminate the Ray runtime, run
| [2024-06-11T18:14:50Z] ray stop
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To view the status of the cluster, use
| [2024-06-11T18:14:50Z] ray status
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To monitor and debug Ray, view the dashboard at
| [2024-06-11T18:14:50Z] 127.0.0.1:8265
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] If connection to the dashboard fails, check your firewall settings and network configuration.
| [2024-06-11T18:14:50Z] 2024-06-11 18:05:08,230 INFO worker.py:1585 -- Connecting to existing Ray cluster at address: 172.30.241.182:6379...
| [2024-06-11T18:14:50Z] 2024-06-11 18:05:08,245 INFO worker.py:1761 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:05:12,027 C 10256 14764] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information ***
| [2024-06-11T18:14:50Z] (raylet) unknown
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:06:13,109 C 15980 7336] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450 [repeated 4x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information *** [repeated 4x across cluster]
| [2024-06-11T18:14:50Z] (raylet) unknown [repeated 59x across cluster]
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:05:12,027 C 10256 14764] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450 [repeated 6x across cluster]
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information *** [repeated 6x across cluster]
| [2024-06-11T18:14:50Z] (raylet) unknown [repeated 72x across cluster]
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:08:16,404 C 15868 7964] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450 [repeated 11x across cluster]
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information *** [repeated 11x across cluster]
| [2024-06-11T18:14:50Z] (raylet) unknown [repeated 132x across cluster]
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:09:17,560 C 6788 16808] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450 [repeated 11x across cluster]
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information *** [repeated 11x across cluster]
| [2024-06-11T18:14:50Z] (raylet) unknown [repeated 132x across cluster]
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] ================================================================================
| [2024-06-11T18:14:50Z] ==================== Test output for //python/ray/tests:test_implicit_resource:
| [2024-06-11T18:14:50Z] ============================= test session starts =============================
| [2024-06-11T18:14:50Z] platform win32 -- Python 3.9.7, pytest-7.0.1, pluggy-1.3.0 -- C:\Miniconda3\python.exe
| [2024-06-11T18:14:50Z] cachedir: .pytest_cache
| [2024-06-11T18:14:50Z] rootdir: C:\Users\ContainerAdministrator\AppData\Local\Temp\Bazel.runfiles_1yvbyj79\runfiles\com_github_ray_project_ray
| [2024-06-11T18:14:50Z] plugins: anyio-3.7.1, asyncio-0.16.0, docker-tools-3.1.3, forked-1.4.0, httpserver-1.0.6, lazy-fixture-0.6.3, rerunfailures-11.1.2, shutil-1.7.0, sphinx-0.5.1.dev0, sugar-0.9.5, timeout-2.1.0, virtualenv-1.7.0
| [2024-06-11T18:14:50Z] collecting ... collected 3 items
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] python/ray/tests/test_implicit_resource.py::test_implicit_resource 2024-06-11 18:09:56,404 INFO worker.py:1761 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
| Error creating PyTest summary | 3m 12s
| [2024-06-11T18:14:50Z] [Errno 2] No such file or directory: 'C:/artifact-mount/test-summaries\\python/ray/tests/test_implicit_resource.py$$test_implicit_resource.txt'
| [2024-06-11T18:14:50Z] FAILED
| [2024-06-11T18:14:50Z] python/ray/tests/test_implicit_resource.py::test_implicit_resource_autoscaling[v1] 2024-06-11 18:10:07,693 - INFO - NumExpr defaulting to 1 threads.
| [2024-06-11T18:14:50Z] Did not find any active Ray processes.
| [2024-06-11T18:14:50Z] 2024-06-11 18:10:10,307 - INFO - NumExpr defaulting to 1 threads.
| [2024-06-11T18:14:50Z] Usage stats collection is disabled.
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] Local node IP: 172.30.241.182
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] --------------------
| [2024-06-11T18:14:50Z] Ray runtime started.
| [2024-06-11T18:14:50Z] --------------------
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] Next steps
| [2024-06-11T18:14:50Z] To add another node to this Ray cluster, run
| [2024-06-11T18:14:50Z] RAY_ENABLE_WINDOWS_OR_OSX_CLUSTER=1 ray start --address='172.30.241.182:6379'
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To connect to this Ray cluster:
| [2024-06-11T18:14:50Z] import ray
| [2024-06-11T18:14:50Z] ray.init()
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To submit a Ray job using the Ray Jobs CLI:
| [2024-06-11T18:14:50Z] RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir . -- python my_script.py
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html
| [2024-06-11T18:14:50Z] for more information on submitting Ray jobs to the Ray cluster.
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To terminate the Ray runtime, run
| [2024-06-11T18:14:50Z] ray stop
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To view the status of the cluster, use
| [2024-06-11T18:14:50Z] ray status
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To monitor and debug Ray, view the dashboard at
| [2024-06-11T18:14:50Z] 127.0.0.1:8265
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] If connection to the dashboard fails, check your firewall settings and network configuration.
| [2024-06-11T18:14:50Z] 2024-06-11 18:10:13,091 INFO worker.py:1585 -- Connecting to existing Ray cluster at address: 172.30.241.182:6379...
| [2024-06-11T18:14:50Z] 2024-06-11 18:10:13,107 INFO worker.py:1761 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:10:16,019 C 8976 14668] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information ***
| [2024-06-11T18:14:50Z] (raylet) unknown
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:10:16,019 C 8976 14668] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450 [repeated 4x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information *** [repeated 4x across cluster]
| [2024-06-11T18:14:50Z] (raylet) unknown [repeated 59x across cluster]
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:10:16,019 C 8976 14668] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450 [repeated 8x across cluster]
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information *** [repeated 8x across cluster]
| [2024-06-11T18:14:50Z] (raylet) unknown [repeated 96x across cluster]
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] 2024-06-11 18:12:21,735 - INFO - NumExpr defaulting to 1 threads.
| Stopped all 35 Ray processes.
| [2024-06-11T18:14:50Z] (autoscaler +22s) Tip: use `ray status` to view detailed cluster status. To disable these messages, set RAY_SCHEDULER_EVENTS=0.
| [2024-06-11T18:14:50Z] (autoscaler +22s) Resized to 0 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +22s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +22s) Resized to 8 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +22s) Resized to 0 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +22s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +22s) Resized to 8 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +24s) Adding 3 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +24s) Resized to 16 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +24s) Adding 3 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +24s) Resized to 16 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +24s) Resized to 0 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +24s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +24s) Resized to 8 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +24s) Adding 3 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +24s) Resized to 16 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +25s) Resized to 0 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +25s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +25s) Resized to 8 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +25s) Adding 3 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +25s) Resized to 16 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m25s) Failed to launch 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m25s) Failed to launch 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m25s) Failed to launch 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m25s) Failed to launch 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Resized to 0 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Resized to 8 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Adding 3 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Resized to 16 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Failed to launch 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Adding 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Resized to 24 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Adding 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Resized to 24 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Adding 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Resized to 24 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Adding 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Resized to 24 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Adding 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m26s) Resized to 24 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m27s) Resized to 0 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m27s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m27s) Resized to 8 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m27s) Adding 3 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m27s) Resized to 16 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +1m27s) Failed to launch 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m27s) Adding 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +1m27s) Resized to 24 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m27s) Failed to launch 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m27s) Failed to launch 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m27s) Failed to launch 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m27s) Failed to launch 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m27s) Failed to launch 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m27s) Failed to launch 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 0 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 8 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 3 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 16 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Failed to launch 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 2 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 24 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Failed to launch 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 32 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 32 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 32 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 32 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 32 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 32 CPUs.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Adding 1 node(s) of type cpu_node.
| [2024-06-11T18:14:50Z] (autoscaler +2m28s) Resized to 32 CPUs.
| [2024-06-11T18:14:50Z] PASSED(raylet) [2024-06-11 18:11:18,181 C 2724 19236] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information ***
| [2024-06-11T18:14:50Z] (raylet) unknown [repeated 12x across cluster]
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] python/ray/tests/test_implicit_resource.py::test_implicit_resource_autoscaling[v2] 2024-06-11 18:12:25,105 - INFO - NumExpr defaulting to 1 threads.
| [2024-06-11T18:14:50Z] Did not find any active Ray processes.
| [2024-06-11T18:14:50Z] 2024-06-11 18:12:27,040 - INFO - NumExpr defaulting to 1 threads.
| [2024-06-11T18:14:50Z] Usage stats collection is disabled.
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] Local node IP: 172.30.241.182
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] --------------------
| [2024-06-11T18:14:50Z] Ray runtime started.
| [2024-06-11T18:14:50Z] --------------------
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] Next steps
| [2024-06-11T18:14:50Z] To add another node to this Ray cluster, run
| [2024-06-11T18:14:50Z] RAY_ENABLE_WINDOWS_OR_OSX_CLUSTER=1 ray start --address='172.30.241.182:6379'
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To connect to this Ray cluster:
| [2024-06-11T18:14:50Z] import ray
| [2024-06-11T18:14:50Z] ray.init()
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To submit a Ray job using the Ray Jobs CLI:
| [2024-06-11T18:14:50Z] RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir . -- python my_script.py
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html
| [2024-06-11T18:14:50Z] for more information on submitting Ray jobs to the Ray cluster.
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To terminate the Ray runtime, run
| [2024-06-11T18:14:50Z] ray stop
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To view the status of the cluster, use
| [2024-06-11T18:14:50Z] ray status
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] To monitor and debug Ray, view the dashboard at
| [2024-06-11T18:14:50Z] 127.0.0.1:8265
| [2024-06-11T18:14:50Z]
| [2024-06-11T18:14:50Z] If connection to the dashboard fails, check your firewall settings and network configuration.
| [2024-06-11T18:14:50Z] 2024-06-11 18:12:29,768 INFO worker.py:1585 -- Connecting to existing Ray cluster at address: 172.30.241.182:6379...
| [2024-06-11T18:14:50Z] 2024-06-11 18:12:29,794 INFO worker.py:1761 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:12:31,778 C 7360 11000] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information ***
| [2024-06-11T18:14:50Z] (raylet) unknown
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:12:31,778 C 7360 11000] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450 [repeated 3x across cluster]
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information *** [repeated 3x across cluster]
| [2024-06-11T18:14:50Z] (raylet) unknown [repeated 47x across cluster]
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] (raylet) [2024-06-11 18:12:31,778 C 7360 11000] (raylet.exe) dlmalloc.cc:129: Check failed: *handle != nullptr CreateFileMapping() failed. GetLastError() = 1450 [repeated 7x across cluster]
| [2024-06-11T18:14:50Z] (raylet) *** StackTrace Information *** [repeated 7x across cluster]
| [2024-06-11T18:14:50Z] (raylet) unknown [repeated 84x across cluster]
| [2024-06-11T18:14:50Z] (raylet)
| [2024-06-11T18:14:50Z] ================================================================================
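For triage, the repeated raylet check failures in the log above can be tallied with a small script. This is only a sketch: the regular expression is an assumption based on the line shape seen in this log, `summarize_check_failures` and `WIN32_ERRORS` are hypothetical names, and the error table is a hand-written subset of the Win32 system error codes. In that table, 1450 is `ERROR_NO_SYSTEM_RESOURCES`, which would suggest the machine ran out of system resources when `CreateFileMapping()` was called rather than the test logic itself failing. Because Ray deduplicates log lines by default, the counts are a lower bound.

```python
import re
from collections import Counter

# Hand-written subset of Win32 system error codes (not exhaustive).
WIN32_ERRORS = {
    8: "ERROR_NOT_ENOUGH_MEMORY",
    1450: "ERROR_NO_SYSTEM_RESOURCES",
    1455: "ERROR_COMMITMENT_LIMIT",
}

# Assumed shape of a raylet fatal-check line, taken from the log in this thread:
#   (raylet.exe) dlmalloc.cc:129: Check failed: ... GetLastError() = 1450
CHECK_FAILED = re.compile(
    r"\(raylet\.exe\) (?P<src>\S+:\d+): Check failed: .*?"
    r"GetLastError\(\) = (?P<code>\d+)"
)


def summarize_check_failures(log_text):
    """Count check-failure lines per (source location, error code)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CHECK_FAILED.search(line)
        if match:
            counts[(match.group("src"), int(match.group("code")))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < buildkite_log.txt
    for (src, code), n in summarize_check_failures(sys.stdin.read()).items():
        name = WIN32_ERRORS.get(code, "unknown code")
        print(f"{src}: GetLastError() = {code} ({name}), seen on {n} line(s)")
```

If every failing run shows the same location and code, that points at the Windows CI host's resource limits rather than at `test_implicit_resource` itself, but that conclusion is not confirmed anywhere in this thread.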
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5096#01904c03-ecf0-42fe-b43c-952bf04803e3
CI test windows://python/ray/tests:test_implicit_resource is consistently_failing. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5177#01905f9a-5184-43aa-b737-3a39357eba9f
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5195#01906f1a-9609-4261-a801-c83ade15ef6e
CI test windows://python/ray/tests:test_implicit_resource is consistently_failing. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5261#01907b12-8389-495b-9744-8c42d5c29607
Blamed commit: d14c95c5442a55f82d2349a446d3738d1b54b736 found by bisect job https://buildkite.com/ray-project/release-tests-bisect/builds/1293
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5268#01907e81-47ea-4ea6-b1b2-4158ac2829e7
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5272#0190842a-4550-4d8a-b6c2-821ef904fca3
CI test windows://python/ray/tests:test_implicit_resource is consistently_failing. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5352#01909b0a-59ce-4617-bcac-aba3566e78c3
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5378#01909f1f-9282-465d-8bbc-d31ab8ab93a3
CI test windows://python/ray/tests:test_implicit_resource is consistently_failing. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy
ehh, has been flaky since forever
This test is now considered flaky because it has been failing on postmerge for too long. Flaky tests do not run on premerge.
Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5393#0190a368-8228-4555-a1dc-3fde7391ed46
CI test windows://python/ray/tests:test_implicit_resource is flaky. Recent failures:
DataCaseName-windows://python/ray/tests:test_implicit_resource-END Managed by OSS Test Policy