lostmsu opened this issue 2 months ago
Can you show the actual line of code you used? Are you getting the warning at runtime or at compile/interpretation time?
I don't see this warning when using a CausalSelfAttention layer inside a transformer architecture.
This is the line of code I used:
// "Flash" attention
var y = F.scaled_dot_product_attention(q, k, v, is_casual: true);
where q, k, and v are the query, key, and value tensors produced by the causal attention linear layer.
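For reference, here is a minimal self-contained TorchSharp sketch of the kind of causal self-attention forward pass described above, assuming a GPT-style fused qkv projection. The shapes and names (nEmbd, nHead, cAttn) are illustrative, not from this thread, and it keeps the is_casual spelling from the snippet above, since that appears to be the parameter name in this TorchSharp version:

```csharp
// Minimal sketch; assumes TorchSharp (e.g. TorchSharp-cpu or TorchSharp-cuda-windows).
using System;
using TorchSharp;
using static TorchSharp.torch;
using F = TorchSharp.torch.nn.functional;

class SdpaSketch
{
    static void Main()
    {
        long B = 2, T = 8, nEmbd = 64, nHead = 4;

        // Fused projection producing query, key, and value in one matmul,
        // as in a GPT-style CausalSelfAttention block.
        var cAttn = nn.Linear(nEmbd, 3 * nEmbd);

        var x = randn(B, T, nEmbd);
        var qkv = cAttn.forward(x).split(nEmbd, 2); // split channels into q, k, v
        var (q, k, v) = (qkv[0], qkv[1], qkv[2]);

        // Reshape to (B, nHead, T, headDim) for multi-head attention.
        q = q.view(B, T, nHead, nEmbd / nHead).transpose(1, 2);
        k = k.view(B, T, nHead, nEmbd / nHead).transpose(1, 2);
        v = v.view(B, T, nHead, nEmbd / nHead).transpose(1, 2);

        // "Flash" attention; note the parameter is spelled is_casual
        // in this TorchSharp version, as in the snippet above.
        var y = F.scaled_dot_product_attention(q, k, v, is_casual: true);

        // Merge heads back to (B, T, nEmbd).
        y = y.transpose(1, 2).contiguous().view(B, T, nEmbd);
        Console.WriteLine(string.Join(",", y.shape)); // 2,8,64
    }
}
```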
This is printed when I call functional.scaled_dot_product_attention.
I'm on Windows with TorchSharp-cuda-windows 0.103.0.