Revise #1060 in light of the discussion of #1039.
Namely, make the tests ensure that the dtype of the grads is preserved (it is not necessarily the same as the dtype of the params, e.g. in mixed-precision training).
Also review the patch in #1060 to check whether the initialization of the dtypes in the states is too stringent.
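
A rough sketch of the kind of check meant here, written in plain JAX rather than against the actual code in #1060. The names `init_momentum`, `momentum_update` and `check_dtypes` are hypothetical, and initializing the state from the params' dtype is only an assumption made for illustration:

```python
import jax
import jax.numpy as jnp


def init_momentum(params):
    # Assumption for illustration: the accumulator takes the dtype of the params.
    return jax.tree_util.tree_map(jnp.zeros_like, params)


def momentum_update(grads, state, decay=0.9):
    # Accumulate in the state's dtype.
    new_state = jax.tree_util.tree_map(
        lambda g, m: decay * m + g.astype(m.dtype), grads, state
    )
    # Cast the updates back to the dtype of the incoming grads.
    updates = jax.tree_util.tree_map(
        lambda g, m: m.astype(g.dtype), grads, new_state
    )
    return updates, new_state


def check_dtypes(param_dtype, grad_dtype):
    # Params and grads deliberately get different dtypes (mixed precision).
    params = {"w": jnp.ones((3,), dtype=param_dtype)}
    grads = {"w": jnp.ones((3,), dtype=grad_dtype)}
    state = init_momentum(params)
    updates, new_state = momentum_update(grads, state)
    grads_preserved = updates["w"].dtype == grads["w"].dtype
    state_preserved = new_state["w"].dtype == state["w"].dtype
    return grads_preserved, state_preserved


# float32 params with bfloat16 grads, and the reverse.
assert check_dtypes(jnp.float32, jnp.bfloat16) == (True, True)
assert check_dtypes(jnp.bfloat16, jnp.float32) == (True, True)
```

The point of the sketch is that the assertion compares the updates against the grads, not against the params, so it still holds when the two dtypes differ.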