Checklist
[x] I've built my own container based off the DLC (the code used to build my image is attached)

Concise Description: The TransformerEngine version included in the image (0.12.0) is not compatible with FlashAttention > 2.0.4, while recent transformers releases require FlashAttention > 2.0.4. We are also working on a pip wheel for TE v1.11 (ETA 10/15) that will remove the version requirement for flash-attn and make it an optional dependency; that might be a good time to update the DLC.
DLC image/dockerfile: 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.3.0-gpu-py311-cu121-ubuntu20.04-sagemaker
Current behavior: The image ships an outdated TransformerEngine (0.12.0) that does not support recent FlashAttention versions, which in turn blocks recent transformers releases.
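A quick way to confirm what the image actually ships, run inside the container (a minimal sketch; the distribution names under `importlib.metadata` are assumptions, adjust if the image registers them differently):

```python
# Print the package versions bundled in the DLC (run inside the container).
from importlib.metadata import version, PackageNotFoundError

# Distribution names are assumptions; they may differ in the image.
for pkg in ("transformer-engine", "flash-attn", "transformers"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```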
Expected behavior: The image should work with recent versions of FlashAttention and transformers.
Additional context:
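Until the DLC is updated, a rebuild along these lines can work around the conflict. This is an illustrative sketch, not the attached build code; the flash-attn version constraint and the TransformerEngine install source are assumptions:

```dockerfile
# Illustrative workaround sketch -- not the attached build code.
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.3.0-gpu-py311-cu121-ubuntu20.04-sagemaker

# Upgrade FlashAttention past 2.0.4; flash-attn compiles against the torch
# already present in the image, hence --no-build-isolation.
RUN pip install --no-cache-dir --no-build-isolation "flash-attn>2.0.4"

# Replace the bundled TransformerEngine 0.12.0. Once the TE v1.11 wheel
# mentioned above is published, this becomes a plain pip install; until
# then, a source build from the stable branch:
RUN pip install --no-cache-dir git+https://github.com/NVIDIA/TransformerEngine.git@stable
```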