kingoflolz / mesh-transformer-jax

Model parallel transformers in JAX and Haiku
Apache License 2.0
6.29k stars 892 forks

Using `no_repeat_ngram_size` like HF #204

Closed nikhilanayak closed 2 years ago

nikhilanayak commented 2 years ago

Is there any way I can use something like the `no_repeat_ngram_size` feature from HuggingFace when generating text with this repo? My generated text is very repetitive.
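
For readers unfamiliar with the feature: in HuggingFace `generate`, `no_repeat_ngram_size=n` stops any n-gram from appearing twice in the output by banning, at each step, every token that would complete an n-gram already generated. Below is a minimal pure-Python sketch of that idea. The function names `banned_next_tokens` and `mask_logits` are hypothetical, and this is neither HuggingFace's implementation nor part of mesh-transformer-jax; where such a mask would hook into this repo's sampler is left open.

```python
from typing import List, Sequence, Set


def banned_next_tokens(tokens: Sequence[int], ngram_size: int) -> Set[int]:
    """Return token ids that would complete an n-gram already present in `tokens`.

    Hypothetical helper, written only to illustrate the technique.
    """
    if ngram_size <= 0 or len(tokens) < ngram_size:
        # No complete n-gram has been generated yet, so nothing can repeat.
        return set()
    # The last (n - 1) tokens are the prefix that the next token would extend.
    prefix = tuple(tokens[len(tokens) - ngram_size + 1:])
    banned: Set[int] = set()
    for start in range(len(tokens) - ngram_size + 1):
        ngram = tuple(tokens[start:start + ngram_size])
        if ngram[:-1] == prefix:
            # Emitting ngram[-1] next would reproduce this earlier n-gram.
            banned.add(ngram[-1])
    return banned


def mask_logits(logits: List[float], tokens: Sequence[int], ngram_size: int) -> List[float]:
    """Set the logits of banned tokens to -inf so sampling can never pick them."""
    banned = banned_next_tokens(tokens, ngram_size)
    return [float("-inf") if i in banned else x for i, x in enumerate(logits)]
```

With the history `[1, 2, 3, 1, 2]` and `ngram_size=3`, the trailing prefix `(1, 2)` was previously followed by `3`, so token `3` is the only one banned at the next step. The scan is linear in the sequence length per step; a real sampler would probably cache the n-gram table rather than rebuild it each time.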

nikhilanayak commented 2 years ago

Never mind, I figured out how to convert the model to HF. Is there an HF version with TPU support?

kingoflolz commented 2 years ago

Not as far as I know. You will likely have better results asking on the HF GitHub or Slack.

leejason commented 2 years ago

> I figured out how to convert the model to HF

Very interesting. Is it possible to know how you did it? I tried several times but failed.