Closed · j316chuck closed 2 months ago
Description

Add fp32 to the set of valid inputs for the attention layer.
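To illustrate the kind of change described, here is a minimal sketch of allowing fp32 in an attention-input dtype check. The set and helper names below are hypothetical and are not the repository's actual identifiers.

```python
import torch

# Hypothetical names for illustration only; the PR's actual set/helper may differ.
VALID_ATTN_INPUT_DTYPES = {torch.float16, torch.bfloat16, torch.float32}  # fp32 newly allowed

def check_valid_attn_input_dtype(query: torch.Tensor) -> None:
    """Raise if the attention layer receives an unsupported input dtype."""
    if query.dtype not in VALID_ATTN_INPUT_DTYPES:
        raise TypeError(
            f'Attention inputs must be one of {VALID_ATTN_INPUT_DTYPES}, '
            f'got {query.dtype}.'
        )
```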
Note: tested manually; there is no unit test for this.

Tests:
- full-eval-fp32-train-fp8-llama3-8b-metamath-4ep-ObqlFj (mixed_precision: full) 🔴
- torch-attn-full-eval-fp32-train-fp8-metamath-4ep-pmGJKN (mixed_precision: full) ✅
Review comment: @j316chuck I don't think this is correct. Flash attention does not support fp32 (unless that changed recently?)
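For context, FlashAttention's kernels accept only half-precision (fp16/bf16) inputs, so one plausible, purely illustrative way to reconcile the reviewer's point with the change is to gate the allowed dtypes on the attention implementation. The constant and function names below are assumptions, not the repository's actual code.

```python
import torch

# Illustrative gating only; names are hypothetical.
FLASH_ATTN_DTYPES = {torch.float16, torch.bfloat16}       # flash kernels are half-precision only
TORCH_ATTN_DTYPES = FLASH_ATTN_DTYPES | {torch.float32}   # the torch/math path can also take fp32

def validate_attn_dtype(attn_impl: str, dtype: torch.dtype) -> None:
    """Reject dtypes that the chosen attention implementation cannot handle."""
    allowed = FLASH_ATTN_DTYPES if attn_impl == 'flash' else TORCH_ATTN_DTYPES
    if dtype not in allowed:
        raise TypeError(f'{attn_impl} attention does not support {dtype}; allowed: {allowed}.')
```

Under that reading, the test results above would be consistent: the torch-attn run passes with fp32 eval, while the default run fails.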