Open OrenLeung opened 1 day ago
This issue seems to be fixed by updating to the latest PyTorch nightly, since it contains https://github.com/pytorch/pytorch/commit/7e8dace0de6bb589e4fd8f37e8642819b80c0baa, which reverts https://github.com/pytorch/pytorch/pull/137157.
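For anyone checking whether their environment already has a build with the revert, here is a minimal sketch that prints the installed PyTorch version, the git commit it was built from, and the HIP version. The helper name `describe_torch_build` is made up for illustration. It only reports the build info and does not itself verify that the revert commit is included, so the reported nightly date or commit still has to be compared against the PyTorch history by hand.

```python
import importlib.util


def describe_torch_build():
    """Return basic build info for the installed PyTorch, or None if torch is not installed.

    Hypothetical helper for illustration; it only reads attributes that
    PyTorch exposes (torch.__version__, torch.version.git_version,
    torch.version.hip).
    """
    if importlib.util.find_spec("torch") is None:
        return None

    import torch

    return {
        # Nightly builds carry a ".devYYYYMMDD" date in the version string.
        "version": torch.__version__,
        # Commit the wheel was built from, if the build records it.
        "git_commit": getattr(torch.version, "git_version", None),
        # HIP version string on ROCm builds; None on CUDA/CPU builds.
        "hip": getattr(torch.version, "hip", None),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

On a ROCm build, `hip` should be a non-empty version string. If it is `None`, the installed wheel is not a ROCm build at all, and the TransformerEngine error is likely unrelated to the PR above.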
https://github.com/pytorch/pytorch/pull/137157 breaks ROCm/TransformerEngine and my whole fp8 training codebase on MI300X, as it removes MasqueradingAsCUDA, which ROCm/TransformerEngine currently seems to depend on.
Problem Description
I am running into the error `no member named 'getCurrentHIPStreamMasqueradingAsCUDA' in namespace 'c10::hip'` when trying to install ROCm/TransformerEngine following the instructions in the README. Do you have any tips on how to resolve this error?
Reprod
Error Trace
Operating System
Ubuntu
CPU
AMD CPU
GPU
AMD Instinct MI300X
ROCm Version
ROCm 6.2.0