ROCm / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
BSD 3-Clause "New" or "Revised" License

IFU-master-2022-07-29 #80

Closed: hubertlu-tw closed this PR 2 years ago

hubertlu-tw commented 2 years ago

IFU-master-2022-07-29-conflicts_diff.txt

hubertlu-tw commented 2 years ago

The failing test run_optimizers/test_bfloat16 (test_fused_optimizer.TestFusedAdam) on rocm-pytorch-master is related to a regression introduced by PyTorch upstream; it is tracked in this Apex issue.

FAIL: test_bfloat16 (test_fused_optimizer.TestFusedAdam)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/apex/tests/L0/run_optimizers/test_fused_optimizer.py", line 109, in test_bfloat16
    self.gen_single_type_test(param_type=torch.bfloat16, skip_assert=True)
  File "/apex/tests/L0/run_optimizers/test_fused_optimizer.py", line 80, in gen_single_type_test
    ref_optim.step()
  File "/opt/conda/lib/python3.7/site-packages/torch/optim/optimizer.py", line 109, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/optim/adam.py", line 171, in step
    capturable=group['capturable'])
  File "/opt/conda/lib/python3.7/site-packages/torch/optim/adam.py", line 226, in adam
    capturable=capturable)
  File "/opt/conda/lib/python3.7/site-packages/torch/optim/adam.py", line 255, in _single_tensor_adam
    assert not step_t.is_cuda, "If capturable=False, state_steps should not be CUDA tensors."
AssertionError: If capturable=False, state_steps should not be CUDA tensors.
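
For context, here is a minimal sketch of one widely reported way to hit this upstream assertion on PyTorch ~1.12 (this is not taken from the PR, and the apex test may trigger it through a different path): round-tripping an Adam state dict casts the per-parameter "step" counter onto the parameter's device, which the capturable=False code path then rejects. It needs a CUDA/ROCm device to run.

# Hedged repro sketch of the upstream capturable=False assertion.
# Assumes PyTorch ~1.12 and a CUDA/ROCm device; the apex test may differ.
import torch

param = torch.nn.Parameter(torch.randn(4, device="cuda"))
opt = torch.optim.Adam([param], lr=1e-3)  # capturable defaults to False

param.grad = torch.randn_like(param)
opt.step()  # first step: the 'step' state is created as a CPU tensor

# Round-tripping the state dict casts state tensors, including 'step',
# onto the parameter's device (CUDA).
opt.load_state_dict(opt.state_dict())

param.grad = torch.randn_like(param)
opt.step()  # AssertionError: If capturable=False, state_steps should not be CUDA tensors.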

The test has been skipped in a commit of this PR.
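
As a hypothetical illustration of that skip (the actual commit in this PR may do it differently), the affected test in tests/L0/run_optimizers/test_fused_optimizer.py could be disabled with a standard unittest skip decorator until the upstream regression is resolved:

# Hypothetical sketch; the test names mirror the traceback above, but the
# real commit in this PR may use a different skip mechanism.
import unittest
import torch

class TestFusedAdam(unittest.TestCase):  # in apex this derives from the shared optimizer test base
    @unittest.skip("Upstream torch.optim.Adam regression: with capturable=False, "
                   "state_steps must not be CUDA tensors")
    def test_bfloat16(self):
        # Never executes while the skip is in place.
        self.gen_single_type_test(param_type=torch.bfloat16, skip_assert=True)

if __name__ == "__main__":
    unittest.main()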

jithunnair-amd commented 2 years ago

Any conflicts when doing the IFU?

jithunnair-amd commented 2 years ago

Sorry, I'm unable to do a thorough review here since the diff is huge. But since the unit test results are in order, I'm okay with merging this PR.