@regisss I have tested the performance of FP32 vs. BF16 on Gaudi2; the data is below.
| precision | train_samples_per_second | eval perplexity |
|---|---|---|
| FP32 | 47.624 | 21.0109 |
| BF16 | 55.511 | 21.177 |

| precision | train_samples_per_second | eval perplexity |
|---|---|---|
| FP32 | 306.631 | 21.7935 |
| BF16 | 357.932 | 22.1765 |
Is the above accuracy acceptable?
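For context, numbers like these typically come from a `GaudiTrainer` run where the Gaudi config toggles HMP on or off. A minimal sketch follows; the dataset slice, batch size, output directory, and other hyperparameters are placeholders, not the settings behind the tables above:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Small wikitext slice as a stand-in corpus -- a placeholder, not the
# dataset used for the benchmark above.
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_dataset = raw.map(tokenize, batched=True, remove_columns=["text"])
# Causal LM: labels are the input ids themselves.
train_dataset = train_dataset.add_column("labels", train_dataset["input_ids"])

# The Gaudi config drives HMP: ops in its bf16 list run in BF16,
# ops in its fp32 list stay in FP32.
gaudi_config = GaudiConfig.from_pretrained("Habana/gpt2")

args = GaudiTrainingArguments(
    output_dir="/tmp/gpt2-hmp-test",  # placeholder path
    use_habana=True,                  # run on HPU
    use_lazy_mode=True,               # lazy-mode graph execution
    per_device_train_batch_size=4,    # placeholder, not the benchmark setting
    num_train_epochs=1,
)

trainer = GaudiTrainer(
    model=model,
    gaudi_config=gaudi_config,
    args=args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```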
Closed since the PR has been merged.
Yes, I will submit a PR to disable HMP here. Should I remove `mul` from the FP32 list here, or remove all the BF16 and FP32 ops and just set `"use_habana_mixed_precision": true`?
I think you can just remove `mul` in the Gaudi config of GPT2; that will make things clearer regarding which ops are computed in BF16.
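For illustration, the change would amount to something like this in GPT2's `gaudi_config.json`. This is a sketch only: the op lists below are assumptions, not the exact contents of the `Habana/gpt2` config; the point is that `mul` no longer appears in `hmp_fp32_ops`:

```json
{
  "use_habana_mixed_precision": true,
  "hmp_is_verbose": false,
  "use_fused_adam": true,
  "use_fused_clip_norm": true,
  "hmp_bf16_ops": ["add", "addmm", "bmm", "div", "dropout", "gelu", "linear", "matmul", "mm", "softmax"],
  "hmp_fp32_ops": ["embedding", "log_softmax", "nll_loss"]
}
```

Ops listed under `hmp_bf16_ops` are cast to BF16 and ops under `hmp_fp32_ops` are kept in FP32, so dropping `mul` from the FP32 list simply stops forcing it into FP32.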
@regisss I will follow your comments after PR #232 is merged.
@ZhaiFeiyue We can close this one, right?
@regisss yes
### Feature request

Enable HMP for GPT2.

### Motivation

BF16 has better performance than FP32.

### Your contribution

Submitting a PR.