Closed zstreet87 closed 2 months ago
Retest Ubuntu-CPU please. Retest Ubuntu-GPU-multi please. Retest Ubuntu-GPU-single please.
Retest Ubuntu-GPU-multi please. Retest Ubuntu-GPU-single please.
Retest Ubuntu-GPU-multi please.
Retest Ubuntu-GPU-multi please.
Retest Ubuntu-GPU-multi please.
Retest Ubuntu-GPU-multi please
Retest Ubuntu-GPU-multi please
Retest Ubuntu-GPU-multi please
@zstreet87 this always fails at //tensorflow/python/distribute:collective_all_reduce_strategy_test_xla_2gpu
in CI. Could you run a quick check locally as well to see whether it passes?
branch: r2.15-rocm-enhanced
command: tf-docker /tensorflow > bazel --bazelrc=/usertools/rocm.bazelrc test --config=rocm --run_under=//tensorflow/tools/ci_build/gpu_build:parallel_gpu_execute -- //tensorflow/python/distribute:collective_all_reduce_strategy_test_xla_2gpu
result: //tensorflow/python/distribute:collective_all_reduce_strategy_test_xla_2gpu PASSED in 54.0s
Is the CI using only 1 GPU?
Oh, all tests pass now. You can merge.
Skip multi-GPU tests in the rocm.bazelrc file for single-GPU runs (needed for Navi).
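
For illustration only, one way such a skip could look in a bazelrc file is a dedicated config that filters tests by tag. This is a rough sketch, not the actual change in this PR: the config name `rocm_single_gpu` is hypothetical, and it assumes the multi-GPU targets carry a `multi_gpu` tag. `--test_tag_filters` is a standard Bazel option, and a leading `-` excludes tests with that tag.

```
# Hypothetical config name; not taken from this PR's diff.
# Assumes multi-GPU test targets are tagged "multi_gpu".
test:rocm_single_gpu --test_tag_filters=-multi_gpu
```

Under those assumptions, a single-GPU run would add `--config=rocm_single_gpu` next to `--config=rocm` on the `bazel test` command line, and targets with the assumed tag would be filtered out rather than reported as failures.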