Open astachowiczhabana opened 11 hours ago
Transformers v4.45 introduced sdpa as the default attention implementation in Albert, which caused a performance drop on Gaudi. This PR adds Albert to the list of models that don't yet have an sdpa implementation on Gaudi and therefore use eager attention.
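For illustration only, the fallback described above could look roughly like the sketch below. The names `MODELS_WITHOUT_SDPA_ON_GAUDI` and `select_attn_implementation` are hypothetical and are not taken from this PR or from the Optimum Habana codebase. Only the `albert` entry is confirmed by the description; any other entries in the real list are omitted here.

```python
# Hypothetical sketch: these names are illustrative, not the actual Optimum Habana code.
# Only "albert" is confirmed by this PR; other entries of the real list are not shown.
MODELS_WITHOUT_SDPA_ON_GAUDI = {"albert"}


def select_attn_implementation(model_type: str, default: str = "sdpa") -> str:
    """Return "eager" for models without an sdpa path on Gaudi, else the default."""
    if model_type.lower() in MODELS_WITHOUT_SDPA_ON_GAUDI:
        return "eager"
    return default


if __name__ == "__main__":
    for name in ("albert", "bert"):
        print(name, "->", select_attn_implementation(name))
```

The selected value could then be passed as the `attn_implementation` argument when loading a model with Transformers, assuming the caller controls model loading.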
Hi @libinta, this commit is also required for the next OH release.