FdyCN closed this issue 11 months ago
Because there's a compiler bug that causes runtime crashes. The only workaround is to reassign the index to 50000, so it doesn't conflict with index 101 in the other GPU kernel.
@philipturner thanks for your reply. Does that mean some other GPU kernel from the system or the Xcode SDK is using index 101?
The other GPU kernel in MFA is using 101. It isn't supposed to cause issues, but with a specific version of the Xcode SDK that built this, there was an issue. It was a bug, not intended behavior from the compiler.
I found that the constant indices in Attention.metal are 100 -> 50000 -> 102 -> 103: https://github.com/philipturner/metal-flash-attention/blob/32592c98eff18001d4eec2c7a204e288fa92fa44/Sources/Attention.metal#L39
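For anyone following along, here is a minimal sketch of how the host side would have to match that index sequence when specializing a kernel. It is not the MFA code: the kernel name `example`, the constant names `flagA` through `flagD`, and the inline shader source are all hypothetical and only illustrate that the index passed to `setConstantValue` must equal the one in the shader's `[[function_constant(...)]]` attribute, including the out-of-sequence 50000.

```swift
import Metal

// Hypothetical shader source (NOT the real Attention.metal). The constant
// names are invented; only the index sequence 100, 50000, 102, 103 is
// taken from the thread above.
let source = """
#include <metal_stdlib>
using namespace metal;

constant bool flagA [[function_constant(100)]];
constant bool flagB [[function_constant(50000)]]; // moved off 101 as the workaround
constant bool flagC [[function_constant(102)]];
constant bool flagD [[function_constant(103)]];

kernel void example(device uint *out [[buffer(0)]],
                    uint tid [[thread_position_in_grid]]) {
  out[tid] = (flagA ? 1u : 0u) | (flagB ? 2u : 0u)
           | (flagC ? 4u : 0u) | (flagD ? 8u : 0u);
}
"""

guard let device = MTLCreateSystemDefaultDevice() else {
  fatalError("No Metal device available")
}

// Compile the shader source at runtime.
let library = try device.makeLibrary(source: source, options: nil)

// Bind each value at the same index the shader declares.
let constants = MTLFunctionConstantValues()
var flagA = false, flagB = true, flagC = false, flagD = false
constants.setConstantValue(&flagA, type: .bool, index: 100)
constants.setConstantValue(&flagB, type: .bool, index: 50000)
constants.setConstantValue(&flagC, type: .bool, index: 102)
constants.setConstantValue(&flagD, type: .bool, index: 103)

// Specialize the function with these constants and build a pipeline.
let function = try library.makeFunction(name: "example", constantValues: constants)
let pipeline = try device.makeComputePipelineState(function: function)
print("Specialized pipeline for:", function.name)
```

If the host kept binding the value at index 101 after the shader moved to 50000, the constant would be left unset on the shader side, so both sides have to change together.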