Closed yichunk closed 3 months ago
Add cross attention layer and refactor t5_attention to inherit from cross attention.
Tested the T5 conversion; it works.
BUG=b/311216181
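For reviewers, the inheritance described above can be pictured roughly as follows. This is a minimal, dependency-free sketch, not the code in this change: the class names `CrossAttention` and `T5Attention`, the `bias` hook, and the `rel_bias` table are all hypothetical, and learned projections, multiple heads, and masking are left out. It assumes the T5 variant differs from the base layer only by adding a relative position bias and by not scaling the scores.

```python
import math


def _softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


class CrossAttention:
    """Hypothetical single-head cross attention.

    Queries come from `x`; keys and values both come from `memory`.
    Learned projections are omitted to keep the sketch short.
    """

    def __init__(self, scale=None):
        # None means "use 1/sqrt(dim)" at call time.
        self.scale = scale

    def bias(self, q_len, kv_len):
        # Hook for subclasses; the base layer adds no bias.
        return [[0.0] * kv_len for _ in range(q_len)]

    def forward(self, x, memory):
        dim = len(x[0])
        scale = self.scale if self.scale is not None else 1.0 / math.sqrt(dim)
        bias = self.bias(len(x), len(memory))
        out = []
        for i, q in enumerate(x):
            scores = [
                scale * sum(a * b for a, b in zip(q, k)) + bias[i][j]
                for j, k in enumerate(memory)
            ]
            weights = _softmax(scores)
            out.append([
                sum(w * v[d] for w, v in zip(weights, memory))
                for d in range(len(memory[0]))
            ])
        return out


class T5Attention(CrossAttention):
    """Hypothetical T5-style attention built on the cross attention layer."""

    def __init__(self, rel_bias):
        # Assumption for this sketch: scores are left unscaled.
        super().__init__(scale=1.0)
        # Assumed format: dict mapping (key_pos - query_pos) to a bias value.
        self.rel_bias = rel_bias

    def bias(self, q_len, kv_len):
        return [
            [self.rel_bias.get(j - i, 0.0) for j in range(kv_len)]
            for i in range(q_len)
        ]

    def forward(self, x, memory=None):
        # With no memory given, fall back to self attention.
        return super().forward(x, x if memory is None else memory)
```

In this shape, the subclass overrides only the bias hook and the default for `memory`, so the score and weighting logic lives in one place.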