The failing test `run_optimizers/test_bfloat16` (`test_fused_optimizer.TestFusedAdam`) in rocm-pytorch-master is caused by a regression introduced by upstream PyTorch. It is captured in this Apex issue.
```
FAIL: test_bfloat16 (test_fused_optimizer.TestFusedAdam)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/apex/tests/L0/run_optimizers/test_fused_optimizer.py", line 109, in test_bfloat16
    self.gen_single_type_test(param_type=torch.bfloat16, skip_assert=True)
  File "/apex/tests/L0/run_optimizers/test_fused_optimizer.py", line 80, in gen_single_type_test
    ref_optim.step()
  File "/opt/conda/lib/python3.7/site-packages/torch/optim/optimizer.py", line 109, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/optim/adam.py", line 171, in step
    capturable=group['capturable'])
  File "/opt/conda/lib/python3.7/site-packages/torch/optim/adam.py", line 226, in adam
    capturable=capturable)
  File "/opt/conda/lib/python3.7/site-packages/torch/optim/adam.py", line 255, in _single_tensor_adam
    assert not step_t.is_cuda, "If capturable=False, state_steps should not be CUDA tensors."
AssertionError: If capturable=False, state_steps should not be CUDA tensors.
```
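For context, the assertion comes from the `capturable` handling that upstream PyTorch added to `torch.optim.Adam`: with `capturable=False`, the optimizer now requires the `step` state tensor to live on the CPU. The snippet below is only an illustrative sketch, not the Apex test path; it assumes a CUDA-enabled PyTorch build recent enough to contain this check, and the manual move of the `step` tensor to the GPU is a contrived stand-in for whatever the real regression does inside the test.

```python
# Illustrative sketch only -- not the actual Apex test path.
# Assumes a CUDA-enabled PyTorch build that contains the
# capturable/state_steps assertion shown in the traceback above.
import torch

param = torch.zeros(4, device="cuda", requires_grad=True)
opt = torch.optim.Adam([param], lr=1e-3)  # capturable defaults to False

param.grad = torch.ones_like(param)
opt.step()  # state['step'] is created as a CPU tensor here

# Contrived trigger: force the step counter onto the GPU while
# capturable is still False, then step again.
opt.state[param]["step"] = opt.state[param]["step"].cuda()
param.grad = torch.ones_like(param)
try:
    opt.step()
except AssertionError as err:
    print(err)  # If capturable=False, state_steps should not be CUDA tensors.
```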
The test has been skipped in a commit of this PR.
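The skip itself is straightforward; below is a minimal sketch of what such a skip typically looks like in a unittest-based suite like Apex's L0 tests. The decorator placement and the reason string are illustrative only and are not quoted from the actual commit.

```python
# Illustrative sketch only -- the real change lives in
# apex/tests/L0/run_optimizers/test_fused_optimizer.py.
import unittest
import torch

class TestFusedAdam(unittest.TestCase):
    @unittest.skip("Skipped due to the upstream torch.optim.Adam "
                   "capturable/state_steps regression (see the Apex issue above).")
    def test_bfloat16(self):
        self.gen_single_type_test(param_type=torch.bfloat16, skip_assert=True)
```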
Any conflicts when doing the IFU?
Sorry, I'm unable to do a thorough review here since the diff is huge. But since the unit test results are in order, I'm okay with merging this PR.
IFU-master-2022-07-29-conflicts_diff.txt