Flash attention only supports architecture with mp version 2.2. But now attempts to run on a mp 2.1 gpu. (Triggered internally at /home/torch_musa/torch_musa/csrc/aten/ops/attention/mudnn/SDPUtils.h:28.)
return func(*args, **kwargs)
muDNN(v2400) 2024-03-27 21:57:33.445104 0d:0h:0m:20s TID=0xbea59ff44d8e6e9f GPU=0 Handle=0x7b6d170 ERROR# NOT_SUPPORTED in Reduce::Run, Reason:
The reduce mode(PROD) not support input dtype(BOOL) & output dtype(INT64)
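The first warning says the flash attention kernel requires mp (compute architecture) version 2.2, while the GPU reports 2.1. A possible workaround, sketched below under the assumption that torch_musa honors upstream PyTorch's SDPA backend toggles (the `torch.backends.cuda.*` names are the upstream CUDA-branded APIs and are not confirmed for MUSA), is to disable the flash kernel so `scaled_dot_product_attention` falls back to a portable implementation:

```python
import torch
import torch.nn.functional as F

# Hypothetical workaround: turn off the flash kernel and keep the
# fallback backends enabled, so SDPA never dispatches to flash
# attention on an unsupported (mp 2.1) architecture.
torch.backends.cuda.enable_flash_sdp(False)
torch.backends.cuda.enable_mem_efficient_sdp(True)
torch.backends.cuda.enable_math_sdp(True)  # portable math fallback

# Dummy (batch, heads, seq_len, head_dim) tensors to exercise SDPA.
q = k = v = torch.randn(1, 2, 8, 16)
out = F.scaled_dot_product_attention(q, k, v)
```

Whether these toggles actually affect the MUSA dispatch path would need to be verified against the torch_musa sources; they are shown here only as the closest upstream equivalent.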
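The second error says muDNN's Reduce kernel cannot run a PROD reduction with BOOL input and INT64 output, which is what PyTorch produces when `.prod()` is called on a boolean tensor. A minimal sketch of a possible workaround (an assumption on my part, not taken from torch_musa documentation) is to cast the tensor to `int64` on the Python side before reducing, so the backend never sees a BOOL input:

```python
import torch

# Calling .prod() directly on a bool tensor asks the backend for a
# BOOL -> INT64 PROD reduction, which muDNN reports as NOT_SUPPORTED.
mask = torch.tensor([True, True, False])

# Workaround sketch: cast first, so the reduction runs entirely in
# int64 (True -> 1, False -> 0); the product is 0 iff any entry is False.
result = mask.to(torch.int64).prod()
```

An equivalent alternative for this particular use is `mask.all()`, which avoids the PROD reduction altogether.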