UL2 Mixture-of-Denoiser loss

huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Apache License 2.0

Feature request

Add the Mixture-of-Denoisers (MoD) losses from the paper "UL2: Unifying Language Learning Paradigms": https://arxiv.org/abs/2205.05131

The original code is based on T5X, which is written in JAX/Flax: https://github.com/google-research/t5x
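
To make the request concrete, here is a minimal pure-Python sketch of how a Mixture-of-Denoisers example could be built. As far as I understand the paper, the objective mainly changes how input/target pairs are constructed (regular span corruption "R", sequential/prefix-LM "S", and extreme corruption "X", each marked with a paradigm token), while the loss itself stays the usual cross-entropy over the target tokens. Everything named below (`Denoiser`, `MIXTURE`, `span_corrupt`, `prefix_lm`, `sample_example`) is hypothetical and not part of `transformers` or T5X, and the span lengths and corruption rates are illustrative placeholders rather than the paper's exact configuration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Denoiser:
    prefix: str          # paradigm token prepended to the input
    mean_span: int       # target average length of a corrupted span
    corrupt_rate: float  # fraction of tokens moved to the target


# Illustrative settings only (assumption, not the paper's configuration).
MIXTURE = [
    Denoiser("[R]", 3, 0.15),   # regular span corruption
    Denoiser("[X]", 3, 0.50),   # extreme: high corruption rate
    Denoiser("[X]", 32, 0.15),  # extreme: long spans
    Denoiser("[S]", 0, 0.0),    # sequential; span settings are unused
]


def random_partition(total, parts, rng):
    """Split `total` into `parts` positive integers that sum to `total`."""
    cuts = sorted(rng.sample(range(1, total), parts - 1))
    return [b - a for a, b in zip([0] + cuts, cuts + [total])]


def span_corrupt(tokens, d, rng):
    """Replace random spans with sentinels; targets hold the removed spans."""
    n = len(tokens)
    n_noise = min(n - 1, max(1, round(n * d.corrupt_rate)))
    n_keep = n - n_noise
    n_spans = max(1, min(n_noise, n_keep, round(n_noise / d.mean_span)))
    noise_lens = random_partition(n_noise, n_spans, rng)
    keep_lens = random_partition(n_keep, n_spans, rng)
    inputs, targets, pos = [d.prefix], [], 0
    for i, (keep, noise) in enumerate(zip(keep_lens, noise_lens)):
        sentinel = f"<extra_id_{i}>"  # T5-style sentinel naming
        inputs += tokens[pos:pos + keep] + [sentinel]
        pos += keep
        targets += [sentinel] + tokens[pos:pos + noise]
        pos += noise
    return inputs, targets


def prefix_lm(tokens, rng):
    """Split the sequence once: the prefix is the input, the rest the target."""
    split = rng.randint(1, len(tokens) - 1)
    return ["[S]"] + tokens[:split], tokens[split:]


def sample_example(tokens, rng):
    """Pick one denoiser uniformly and build an (inputs, targets) pair."""
    d = rng.choice(MIXTURE)
    if d.prefix == "[S]":
        return prefix_lm(tokens, rng)
    return span_corrupt(tokens, d, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    inputs, targets = sample_example([f"t{i}" for i in range(20)], rng)
    print(inputs)
    print(targets)
```

The sketch works on lists of string tokens and needs at least two tokens per sequence. A real implementation would more likely operate on token IDs inside a data collator and sample span lengths from a distribution.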

Motivation

I am requesting the addition of the Mixture-of-Denoisers losses introduced in the UL2 paper. The paper reports that these losses improve the performance of models trained with unsupervised objectives, and I believe they would benefit the Hugging Face community.

Your contribution

Opening this feature request.
