Closed by xinyazhang 1 month ago
Tested on rocm-framework-51
```
(py_3.10) xinyazha@12fd1640b6b7:~/rocm-pytorch$ PYTORCH_TEST_WITH_ROCM=1 python test/distributed/_tensor/test_attention.py -k test_ring_attention_compile_attention_fn1 -v
test_ring_attention_compile_attention_fn1 (__main__.RingAttentionTest) ... [rank1]:[W530 08:06:08.493322199 ProcessGroupNCCL.cpp:1113] WARNING: process group has NOT been destroyed before it is being destructed. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL data transfers have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4
[rank0]:[W530 08:06:08.539286216 ProcessGroupNCCL.cpp:1113] WARNING: process group has NOT been destroyed before it is being destructed. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL data transfers have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4
skipped 'Test skipped at subprocess level, look at subprocess log for skip reason'
----------------------------------------------------------------------
Ran 1 test in 4.512s
OK (skipped=1)
```
The UT does not skip the efficient attention operator on platforms that do not support efficient attention; support for flash attention does not guarantee that efficient attention is also implemented.
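The intended skip pattern can be sketched as below. This is a minimal, self-contained illustration: `platform_supports_mem_eff_attention()` is a hypothetical stand-in for a real capability probe (in PyTorch's test suite that role is played by flags such as `PLATFORM_SUPPORTS_MEM_EFF_ATTENTION` in `torch.testing._internal.common_cuda`), and the test body is a placeholder rather than the actual ring-attention test.

```python
import unittest

# Hypothetical capability probe. Real code would query the backend;
# the key point is that flash-attention support must be checked
# separately from efficient-attention support.
def platform_supports_mem_eff_attention() -> bool:
    # Pretend we are on a platform without efficient attention,
    # as on the ROCm configuration in the log above.
    return False

class RingAttentionSketch(unittest.TestCase):
    # Skip (rather than run and fail) when the backend lacks
    # efficient attention, even if flash attention is available.
    @unittest.skipUnless(
        platform_supports_mem_eff_attention(),
        "platform does not support efficient attention",
    )
    def test_ring_attention_compile_attention_fn1(self):
        self.assertTrue(True)  # placeholder body

suite = unittest.defaultTestLoader.loadTestsFromTestCase(RingAttentionSketch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, len(result.skipped))  # the single test is skipped
```

Run on a platform where the probe returns `False`, the runner reports the test as skipped instead of failed, matching the `OK (skipped=1)` outcome in the log.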
Fixes #SWDEV-459621