use tensor.copy_ if tensor.set_ fails

apex.amp monkey-patches optimizer.zero_grad() to set all grads to None,
so torch creates a new p.grad tensor for each parameter on the next
backward pass. Calling p.grad.set_() on that tensor raises:
RuntimeError: set_storage is not allowed on a Tensor created from .data or .detach().
If your intent is to change the metadata of a Tensor (such as sizes / strides / storage / storage_offset)
without autograd tracking the change, remove the .data / .detach() call and wrap the change in a `with torch.no_grad():` block.
For example, change:
x.data.set_(y)
to:
with torch.no_grad():
    x.set_(y)
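
To handle this, fall back to an element-wise copy when the in-place
rebind is rejected. A minimal sketch of the idea (assign_grad and
new_grad are illustrative names, not the actual patch):

    import torch

    def assign_grad(p, new_grad):
        # Sketch only: assumes p.grad already exists and new_grad has
        # a matching shape.
        with torch.no_grad():
            try:
                # Fast path: rebind p.grad's storage in place.
                p.grad.set_(new_grad)
            except RuntimeError:
                # set_ is disallowed on tensors created from
                # .data/.detach(), so copy the values instead.
                p.grad.copy_(new_grad)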

Signed-off-by: yulu.jia <yulu.jia@bytedance.com>