Closed (vaibhavad closed 4 months ago)
Discovered while inspecting #68. It is caused by a condition introduced in transformers > 4.40 that passes None as the attention mask, which later defaults to a causal attention mask.
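To make the failure mode concrete, here is a minimal pure-Python sketch of the fallback described above. It is not the transformers code: `resolve_mask`, `causal_mask`, and `bidirectional_mask` are hypothetical names used only for illustration, and masks are plain nested lists of booleans where `True` means "position i may attend to position j".

```python
from typing import List, Optional

Mask = List[List[bool]]


def causal_mask(seq_len: int) -> Mask:
    # Lower-triangular mask: token i can only attend to tokens j <= i.
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def bidirectional_mask(seq_len: int) -> Mask:
    # Full mask: every token can attend to every other token.
    return [[True] * seq_len for _ in range(seq_len)]


def resolve_mask(attention_mask: Optional[Mask], seq_len: int) -> Mask:
    # Hypothetical stand-in for the behaviour described in this issue:
    # when the caller's mask arrives as None, fall back to a causal mask.
    if attention_mask is None:
        return causal_mask(seq_len)
    return attention_mask


if __name__ == "__main__":
    intended = bidirectional_mask(3)
    # Mask is passed through intact: token 0 can see token 2.
    print(resolve_mask(intended, 3)[0][2])
    # Mask was replaced by None upstream: token 0 can no longer see token 2.
    print(resolve_mask(None, 3)[0][2])
```

Under these assumptions, a model that is meant to attend bidirectionally silently becomes causal whenever an upstream condition swaps its mask for None, which matches the symptom described above.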