ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Use torch >= 2.1.1 in Docker images to enable SDPA dispatching via Flash Attention 2 for faster training and inference #3908

Open arnavgarg1 opened 10 months ago

arnavgarg1 commented 10 months ago

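SDPA (scaled dot-product attention) is enabled by default in transformers when torch >= 2.1.1 is installed, for models that have an SDPA implementation.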

We should update our Docker images to ship torch >= 2.1.1 so that we get the faster training and inference with newer transformers versions.
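As a rough sketch of how an image could be checked against this requirement, something like the following might work. The helper names (`parse_release`, `torch_meets_sdpa_minimum`) are hypothetical and not part of Ludwig or transformers; the only fact taken from this issue is the 2.1.1 threshold.

```python
import re

# Minimum torch release named in this issue for SDPA dispatching.
MIN_TORCH_FOR_SDPA = (2, 1, 1)


def parse_release(version: str) -> tuple:
    """Return the numeric (major, minor, patch) part of a version string.

    Local build tags such as '+cu121' and pre-release suffixes are ignored,
    so '2.1.1+cu121' and '2.1.1rc1' both parse as (2, 1, 1).
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def torch_meets_sdpa_minimum(version: str) -> bool:
    """Hypothetical helper: True if the given torch version is >= 2.1.1."""
    return parse_release(version) >= MIN_TORCH_FOR_SDPA


if __name__ == "__main__":
    # Report on the torch build installed in the current environment, if any.
    try:
        import torch
    except ImportError:
        print("torch is not installed")
    else:
        ok = torch_meets_sdpa_minimum(torch.__version__)
        print(f"torch {torch.__version__}: meets SDPA minimum = {ok}")
```

Running this inside each Docker image would show which images still ship an older torch build. It only compares version numbers and does not confirm that a Flash Attention kernel is actually selected at runtime.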