Status: Closed (closed by xuzhao9 1 day ago)
The CI workflow failed because the CI runner does not come with libcuda.so installed. That library ships with the NVIDIA driver package, which we install here: https://github.com/pytorch-labs/tritonbench/blob/main/docker/tritonbench-nightly.dockerfile#L47
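A quick way to confirm whether a runner actually has the driver library is to ask the dynamic linker for it. This is a minimal diagnostic sketch, not part of the fix itself (the real fix is installing the driver package as above):

```python
import ctypes.util

# libcuda.so is shipped by the NVIDIA *driver*, not the CUDA toolkit,
# so a CI image with only the toolkit installed will still fail to
# link against -lcuda.
def has_libcuda() -> bool:
    """Return True if the dynamic linker can locate libcuda."""
    return ctypes.util.find_library("cuda") is not None

if __name__ == "__main__":
    print("libcuda found:", has_libcuda())
```

On a bare CI runner without the driver this prints `libcuda found: False`, which matches the link failure seen in the workflow.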
Thanks for explaining the details! I will close this PR and wait for upstream FA3 to fix and upgrade CUTLASS.
What does this PR do?
Fix the FA3 extension build by adding the cuda library.
The upstream flash-attention repo notes that `-lcuda` is required to build and install the FA3 library: https://github.com/Dao-AILab/flash-attention/blob/main/hopper/setup.py#L222C13-L223C31
We need to add the same library to xformers to build the correct .so file.
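As a sketch of what the change amounts to, here is how `-lcuda` is passed to a setuptools extension. The extension name and source file below are placeholders for illustration, not xformers' actual build configuration:

```python
from setuptools import Extension

def fa3_link_args() -> list:
    # -lcuda links against the NVIDIA driver API (libcuda.so), which
    # FA3 needs in addition to the CUDA runtime the build already links.
    return ["-lcuda"]

# Placeholder extension: the real change adds the flag to xformers'
# FA3 extension definition, not to a module named like this.
ext = Extension(
    name="fa3_placeholder",
    sources=["fa3_placeholder.cpp"],
    extra_link_args=fa3_link_args(),
)

print(ext.extra_link_args)
```

Without this flag the linker can resolve runtime symbols but not driver-API symbols, producing an .so that fails at import time.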
Test Plan:
Before this PR:
After this PR:
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.