Closed leej3 closed 6 months ago
@leej3 can you check why https://github.com/pytorch/ignite/actions/runs/8451157627/job/23149407599?pr=3220 is failing?
looking through it now...
I refactored the commits to make it easier to spot the modifications to the files that were in ignite.contrib.metrics and are now in ignite.metrics.
I can't spot what might be causing the failure of the TPU tests though. The diff is small but persists across reruns:
    def test_distrib_single_device_xla():
        device = idist.device()
>       _test_distrib_compute(device)

tests/ignite/metrics/regression/test_mean_error.py:233:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/ignite/metrics/regression/test_mean_error.py:127: in _test_distrib_compute
    _test("cpu")
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
metric_device = device(type='cpu')

    def _test(metric_device):
        metric_device = torch.device(metric_device)
        m = MeanError(device=metric_device)
        y_pred = torch.rand(size=(100,), device=device)
        y = torch.rand(size=(100,), device=device)
        m.update((y_pred, y))
        y_pred = idist.all_gather(y_pred)
        y = idist.all_gather(y)
        np_y = y.cpu().numpy()
        np_y_pred = y_pred.cpu().numpy()
        np_sum = (np_y - np_y_pred).sum()
        np_len = len(np_y_pred)
        np_ans = np_sum / np_len
>       assert m.compute() == pytest.approx(np_ans)
E       assert 0.003967249393463134 == 0.003967256546020508 ± 4.0e-09
E
E         comparison failed
E         Obtained: 0.003967249393463134
E         Expected: 0.003967256546020508 ± 4.0e-09
I didn't discover anything especially useful from ssh-ing into the TPU test machine. Some of the tests in ignite/metrics/regression fail on some runs on both master and this branch. For a given invocation the failures are deterministic, but whether they occur depends on things like the order in which the tests are run. For example, running pytest regression/test_mean_error.py reliably passes on both branches, but running pytest regression reliably fails on both branches.
I have adjusted the tolerance of the comparisons so that the tests pass more reliably.
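For anyone reading along, here is a rough sketch of why the default comparison fails for the values in the traceback. It reproduces the arithmetic behind pytest.approx's documented defaults (relative tolerance 1e-6, absolute tolerance 1e-12, whichever is larger) in plain Python. The helper name effective_tolerance and the looser value LOOSE_REL = 1e-5 are illustrative assumptions, not the tolerance actually chosen in this PR.

```python
# Values copied from the failing assertion in the traceback.
obtained = 0.003967249393463134
expected = 0.003967256546020508

# Documented pytest.approx defaults: the effective tolerance is the
# larger of rel * |expected| and the absolute tolerance.
DEFAULT_REL = 1e-6
DEFAULT_ABS = 1e-12


def effective_tolerance(expected_value, rel=DEFAULT_REL, abs_tol=DEFAULT_ABS):
    """Hypothetical helper mirroring how the default tolerance is derived."""
    return max(rel * abs(expected_value), abs_tol)


diff = abs(obtained - expected)
default_tol = effective_tolerance(expected)

# The difference (about 7.2e-9) exceeds the default tolerance
# (about 4.0e-9, matching the "± 4.0e-09" shown by pytest).
passes_default = diff <= default_tol

# Assumption: a looser relative tolerance, chosen here only for illustration.
# In a test this would be written as: pytest.approx(np_ans, rel=LOOSE_REL)
LOOSE_REL = 1e-5
passes_loose = diff <= effective_tolerance(expected, rel=LOOSE_REL)

print(f"difference:          {diff:.3e}")
print(f"default tolerance:   {default_tol:.3e}")
print(f"relative difference: {diff / abs(expected):.3e}")
print(f"passes with default: {passes_default}")
print(f"passes with rel={LOOSE_REL}: {passes_loose}")
```

The relative difference is a little under 2e-6, so only a modest loosening of rel is needed for this particular case. Whether that is appropriate for the other regression metrics would need checking test by test.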
Same migration pattern as #3204:
References are updated as part of this PR to avoid failures when building the docs.