google-deepmind / dm-haiku

JAX-based neural network library
https://dm-haiku.readthedocs.io
Apache License 2.0

If I am understanding correctly, one line of the model code of #773

Closed by copybara-service[bot] 8 months ago

copybara-service[bot] commented 8 months ago

If I understand correctly, one line of the model code in the Haiku transformer example depends on the pad token being zero (or, at least, smaller than every non-pad token). This proposed modification removes that requirement.
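
To illustrate the concern, here is a minimal JAX sketch, not the actual diff. It assumes the line in question builds the input mask with a "greater than zero" comparison; the function names and the `pad_token` parameter are hypothetical and used only for illustration. Comparing against an explicit pad id gives the intended mask regardless of where the pad token sits in the vocabulary.

```python
import jax.numpy as jnp


def mask_by_comparison(tokens):
    # Assumed form of the existing check: every token greater than 0 counts
    # as real input, so this is only correct when the pad id is 0 (or at
    # least smaller than every non-pad id).
    return jnp.greater(tokens, 0)


def mask_by_pad_id(tokens, pad_token):
    # Hypothetical replacement: compare against the pad id directly, so the
    # mask does not depend on the numeric ordering of token ids.
    return jnp.not_equal(tokens, pad_token)


# Hypothetical batch where the pad id is 3 and 0 is a real token.
tokens = jnp.array([[5, 0, 7, 3, 3]])

old_mask = mask_by_comparison(tokens)            # drops token 0, keeps the padding
new_mask = mask_by_pad_id(tokens, pad_token=3)   # masks exactly the padded positions
```

With a pad id of 0 the two functions agree whenever all real token ids are positive, so a change of this kind would not alter behavior for the default setup.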