ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

CI test linux://rllib:learning_tests_multi_agent_cartpole_appo_multi_gpu is consistently_failing #46311

Open can-anyscale opened 5 months ago

can-anyscale commented 5 months ago

CI test linux://rllib:learning_tests_multi_agent_cartpole_appo_multi_gpu is consistently_failing. Recent failures:

DataCaseName-linux://rllib:learning_tests_multi_agent_cartpole_appo_multi_gpu-END
Managed by OSS Test Policy

can-anyscale commented 5 months ago

This test is now considered flaky because it has been failing on postmerge for too long. Flaky tests do not run on premerge.

can-anyscale commented 5 months ago

Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5169#01905c00-480b-4ada-9163-29e9df95262a

can-anyscale commented 5 months ago

CI test linux://rllib:learning_tests_multi_agent_cartpole_appo_multi_gpu is consistently_failing. Recent failures:

DataCaseName-linux://rllib:learning_tests_multi_agent_cartpole_appo_multi_gpu-END
Managed by OSS Test Policy

can-anyscale commented 5 months ago

CI test linux://rllib:learning_tests_multi_agent_cartpole_appo_multi_gpu is consistently_failing. Recent failures:

DataCaseName-linux://rllib:learning_tests_multi_agent_cartpole_appo_multi_gpu-END
Managed by OSS Test Policy

can-anyscale commented 5 months ago

This test is now considered flaky because it has been failing on postmerge for too long. Flaky tests do not run on premerge.

sven1977 commented 3 months ago

We have some APPO fixes in the pipeline. Will revisit this issue tomorrow ...

can-anyscale commented 3 months ago

Test passed on latest run: https://buildkite.com/ray-project/postmerge/builds/5898#01915995-8667-47c9-88f2-91f93712b6e1

can-anyscale commented 2 months ago

CI test linux://rllib:learning_tests_multi_agent_cartpole_appo_multi_gpu is consistently_failing. Recent failures:

DataCaseName-linux://rllib:learning_tests_multi_agent_cartpole_appo_multi_gpu-END
Managed by OSS Test Policy

can-anyscale commented 2 months ago

This test is now considered flaky because it has been failing on postmerge for too long. Flaky tests do not run on premerge.