Closed copybara-service[bot] closed 8 months ago
If I understand correctly, one line of the model code in the Haiku transformer example depends on the pad token being zero (or, at least, smaller than every non-pad token). This proposed change removes that requirement.
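To illustrate the concern, here is a minimal sketch. It assumes the line in question builds the padding mask with a comparison along the lines of `tokens > 0`. The function names, the `PAD_TOKEN` value, and the sample batch are hypothetical and are not taken from the example's code.

```python
import jax.numpy as jnp

PAD_TOKEN = 3  # hypothetical pad id, deliberately nonzero


def mask_assuming_zero_pad(tokens):
    """Mask that is only correct when the pad id is 0 (or below all real ids)."""
    return jnp.greater(tokens, 0)


def mask_from_pad_token(tokens, pad_token):
    """Mask that compares against the pad id explicitly."""
    return jnp.not_equal(tokens, pad_token)


# One sequence of three real tokens followed by two pad tokens.
tokens = jnp.array([[5, 1, 7, PAD_TOKEN, PAD_TOKEN]])

# With a nonzero pad id, the first mask treats the padding as real input.
zero_pad_mask = mask_assuming_zero_pad(tokens)

# The explicit comparison marks only the three real tokens as valid.
explicit_mask = mask_from_pad_token(tokens, PAD_TOKEN)
```

Comparing against the pad id directly gives the same mask as the `> 0` version when the pad id is 0, so a change of this kind should not alter behavior for existing users of a zero pad token.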