123epsilon closed this issue 1 month ago
@123epsilon, thanks for bringing this to our attention. We have updated our documentation to clearly list the conditions under which rocBLAS produces deterministic results (https://github.com/ROCm/rocBLAS/commit/42d65e162544b17157607cb643142b0682803e4f).
Atomic operations are enabled by default in current and previous releases of rocBLAS, and functions that use atomic operations may not produce deterministic results.
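For intuition on why atomics can break bit-wise reproducibility: floating-point addition is not associative, so if atomic adds from different threads land in a different order from one run to the next, the rounded result can differ in the last bits. The following is a minimal pure-Python sketch of that effect, not rocBLAS code; the helper name `accumulate` and the sample values are illustrative assumptions.

```python
def accumulate(values):
    """Add values one at a time, the way a series of atomic adds would."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same three partial results, arriving in two different orders.
order_a = [0.1, 0.2, 0.3]
order_b = [0.3, 0.2, 0.1]

sum_a = accumulate(order_a)
sum_b = accumulate(order_b)

# The two sums agree to many digits but are not bit-identical.
print(repr(sum_a), repr(sum_b), sum_a == sum_b)
```

Both sums are equally valid roundings of the same mathematical value, which is why such results are described as nondeterministic rather than wrong.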
The documentation you are referring to is the changelog for rocBLAS 4.0. Our deprecation process involves notifying end users first and then removing the feature in the next major ROCm release. In this case, we issued a deprecation notice in ROCm 6.0, and the actual change could occur in the next major version.
The Bitwise Reproducibility section lists the conditions under which rocBLAS guarantees deterministic results.
In ROCm 6.2 and above, users can set the ROCBLAS_DEFAULT_ATOMICS_MODE environment variable to change the default atomics mode. For prior releases, users must call the rocBLAS rocblas_set_atomics_mode() API to change the default. [Refer to the section on Atomic operations]
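Since the question mentions PyTorch, here is a minimal sketch of setting the environment variable from Python before the framework loads. It assumes that the value "0" selects rocblas_atomics_not_allowed and "1" selects rocblas_atomics_allowed; please confirm this against the rocBLAS documentation for your release. The helper name `request_no_atomics` is hypothetical.

```python
import os

ENV_VAR = "ROCBLAS_DEFAULT_ATOMICS_MODE"  # variable name from the reply above

def request_no_atomics(env=os.environ):
    """Ask rocBLAS (ROCm 6.2+) to default to the atomics-not-allowed mode.

    Assumption: "0" means rocblas_atomics_not_allowed and "1" means
    rocblas_atomics_allowed; verify against the docs for your release.
    """
    env[ENV_VAR] = "0"
    return env[ENV_VAR]

request_no_atomics()

# Import the GPU framework only after the variable is set, so that any
# rocBLAS handle it creates picks up the new default.
# import torch
```

On releases before ROCm 6.2 the environment variable is not available, so the mode has to be set on each handle from C/C++ with rocblas_set_atomics_mode(handle, rocblas_atomics_not_allowed).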
I see, thank you!
@123epsilon, if there are no additional questions, I will proceed to close this issue.
Yes that's all I needed to know - thank you!
Hi, I'm using rocBLAS as a backend for hipBLAS and I wanted to know what the determinism guarantees are when GPU atomics are involved. Specifically, I am using it with PyTorch, and I notice that the cuBLAS documentation mentions settings that can be used to get bit-wise deterministic behavior:
I see in the rocBLAS documentation that, as of rocBLAS 4.0, atomics are not used by default. Does that mean I can safely assume that rocBLAS < 4.0 is nondeterministic because of its use of atomics, and that rocBLAS >= 4.0, if left at its default settings, will give deterministic behavior? Are there any other settings, such as the workspace size, that factor into this?