ROCm / tensorflow-upstream

TensorFlow ROCm port
https://tensorflow.org
Apache License 2.0
684 stars 93 forks

Adjust rocm-smi check for test scripts #2505

Closed jayfurmanek closed 5 months ago

jayfurmanek commented 5 months ago

The rocm-smi tool now prints other kinds of "ID" fields in its output, so the check in the test scripts has to specify which one it matches.
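To illustrate why a loose match on "ID" can overcount, here is a minimal sketch in Python. It is not the change made in this PR. The sample text, the field name "Device ID", and the helpers `count_lines_containing` and `count_gpus` are all assumptions made for illustration; real rocm-smi output may be laid out differently.

```python
import re

# Hypothetical sample text, NOT captured from a real rocm-smi run.
# It only models the situation described above: more than one kind
# of "ID" line is printed per GPU.
SAMPLE = """\
GPU[0]  : Device ID: 0x0001
GPU[0]  : Subsystem ID: 0x0002
GPU[1]  : Device ID: 0x0001
GPU[1]  : Subsystem ID: 0x0002
"""


def count_lines_containing(text: str, needle: str) -> int:
    """Loose check: count every line that contains the substring."""
    return sum(1 for line in text.splitlines() if needle in line)


def count_gpus(text: str, field: str = "Device ID") -> int:
    """Specific check: count distinct GPU indices that report `field`.

    The field name is an assumption; pass whichever label the
    installed tool actually prints.
    """
    pattern = re.compile(r"GPU\[(\d+)\]\s*:\s*" + re.escape(field) + r"\s*:")
    return len({match.group(1) for match in pattern.finditer(text)})


if __name__ == "__main__":
    print("loose match on 'ID':", count_lines_containing(SAMPLE, "ID"))
    print("match on one specific field:", count_gpus(SAMPLE))
```

On this sample the loose match counts every "ID" line, while the specific match counts each GPU once.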

jayfurmanek commented 5 months ago

retest Ubuntu-CPU please

jayfurmanek commented 5 months ago

retest Ubuntu-GPU-multi please

jayfurmanek commented 5 months ago

retest gpu-pycpp please

jayfurmanek commented 5 months ago

retest Ubuntu-GPU-multi please

i-chaochen commented 5 months ago

Recently, many CI jobs have been failing at //tensorflow/python/distribute:collective_all_reduce_strategy_test_xla_2gpu on sc-hw-smc-acc-12 and sc-hw-smc-acc-09.

i-chaochen commented 5 months ago

retest Ubuntu-GPU-multi please