Low-code framework for building custom LLMs, neural networks, and other AI models
Use torch >= 2.1.1 in Docker images to enable SDPA dispatching via Flash Attention 2 for faster training and inference #3908
Open
arnavgarg1 opened 10 months ago
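SDPA (scaled dot-product attention) dispatching is enabled by transformers when torch >= 2.1.1 is installed.
We should update the Docker images to use torch >= 2.1.1 so that we get these performance improvements with newer transformers versions.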
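
As a rough illustration of the version gate described above, here is a minimal sketch of a check that could run inside a Docker image. The helper names (`parse_release`, `torch_meets_sdpa_minimum`, `check_installed_torch`) are hypothetical and not part of this repository or of transformers; the only fact taken from the issue is the 2.1.1 threshold.

```python
import re

# Minimum torch release named in this issue's title.
MIN_TORCH_FOR_SDPA = (2, 1, 1)


def parse_release(version: str) -> tuple:
    """Return the numeric (major, minor, patch) prefix of a version string.

    Local build tags such as '+cu121' are ignored, and a missing patch
    component is treated as 0. Pre-release suffixes (e.g. 'rc1') are also
    ignored, which is a simplification made for this sketch.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def torch_meets_sdpa_minimum(version: str) -> bool:
    """True if the given torch version string is at least 2.1.1."""
    return parse_release(version) >= MIN_TORCH_FOR_SDPA


def check_installed_torch() -> bool:
    """Check the torch build installed in the current environment, if any."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return False
    meets_minimum = torch_meets_sdpa_minimum(torch.__version__)
    # scaled_dot_product_attention is the PyTorch function that SDPA refers to.
    has_sdpa = hasattr(torch.nn.functional, "scaled_dot_product_attention")
    print(f"torch {torch.__version__}: >= 2.1.1: {meets_minimum}, SDPA function present: {has_sdpa}")
    return meets_minimum and has_sdpa
```

Calling `check_installed_torch()` during an image build or smoke test would flag images that still ship an older torch. Whether a given model actually takes the SDPA path also depends on the transformers version and the model architecture, which this sketch does not check.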