huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

DeepMind Retro #15550

Open shamanez opened 2 years ago

shamanez commented 2 years ago

🌟 New model addition

Basically a retrieval-augmented model like RAG, but without expensive end-to-end retriever training.
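The key difference from RAG is that RETRO keeps its retriever frozen: neighbor chunks are fetched by nearest-neighbor search over embeddings from a frozen encoder, so no gradients ever flow into retrieval. A minimal, purely illustrative sketch of that frozen retrieval step (the `toy_embed` function below is a hypothetical stand-in for a frozen BERT encoder, not anything from the paper or this repo):

```python
# Sketch of RETRO-style frozen retrieval: embeddings come from a frozen
# encoder, so retrieval needs no end-to-end training -- unlike learned
# retrievers trained jointly with the reader.
import math

def toy_embed(text, dim=8):
    """Hypothetical stand-in for a frozen encoder: deterministic toy embedding."""
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve_neighbors(query_chunk, corpus_chunks, k=2):
    """Return the k corpus chunks closest to the query by cosine similarity."""
    q = toy_embed(query_chunk)
    scored = sorted(
        ((sum(a * b for a, b in zip(q, toy_embed(c))), c) for c in corpus_chunks),
        reverse=True,
    )
    return [chunk for _, chunk in scored[:k]]

corpus = ["the cat sat", "dogs bark loudly", "the cat sleeps"]
print(retrieve_neighbors("dogs bark loudly", corpus, k=1))
```

In the real model the retrieved chunks are then attended to via chunked cross-attention in the decoder; a production retriever would use an ANN index (e.g. a faiss-style index) rather than a linear scan.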

Open source status

NielsRogge commented 2 years ago

Hi,

No weights are available at the moment. I'd suggest taking a look at this repo: https://github.com/lucidrains/RETRO-pytorch

shamanez commented 2 years ago

Yeah, I saw it. Anyway, I'm thinking of combining RETRO with the RAG model. The results seem very promising, and the model is very practical, since recent work suggests end-to-end retriever training is very expensive and impractical.

shamanez commented 2 years ago

I would love to get some help :)

halixness commented 2 years ago

> I would love to get some help :)

Hi! I am interested in this implementation and I can help with it!

shamanez commented 2 years ago

@halixness do you think we should modify the RAG implementation ?

tanmoyio commented 2 years ago

@shamanez @halixness @NielsRogge I would like to participate in the implementation as well, please let me know.

shamanez commented 2 years ago

Hi All,

I think we can use the RAG pipeline, but we need to implement the reader and its loss function.

We might be able to use this.

Any suggestions?
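If we reuse the RAG pipeline, the reader loss would presumably look like RAG's marginalized negative log-likelihood: the retrieval scores define p(z|x) via a softmax, and the sequence likelihood is marginalized over the retrieved documents. A small sketch under that assumption (function name and shapes are mine, not from the codebase):

```python
# Hypothetical RAG-style reader loss: -log sum_z p(z|x) * p(y|x,z),
# computed stably in log space.
import math

def rag_marginalized_nll(doc_scores, seq_logprobs_per_doc):
    """doc_scores: unnormalized retrieval scores, one per retrieved doc.
    seq_logprobs_per_doc: log p(y|x,z) of the target under each doc."""
    # log-softmax over doc scores -> log p(z|x)
    m = max(doc_scores)
    log_norm = m + math.log(sum(math.exp(s - m) for s in doc_scores))
    log_pz = [s - log_norm for s in doc_scores]
    # log-sum-exp of log p(z|x) + log p(y|x,z)
    joint = [lz + lp for lz, lp in zip(log_pz, seq_logprobs_per_doc)]
    mj = max(joint)
    return -(mj + math.log(sum(math.exp(j - mj) for j in joint)))

# Two equally-scored docs, each giving the target probability 0.5:
# the marginal likelihood is 0.5, so the NLL is log 2.
print(rag_marginalized_nll([0.0, 0.0], [math.log(0.5), math.log(0.5)]))
```

For RETRO the retriever term would be fixed (frozen retrieval), so the loss could reduce to plain next-token cross-entropy conditioned on the retrieved chunks; the marginalization above is only needed if we keep RAG's learned document posterior.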

mcschmitz commented 2 years ago

> I would love to get some help :)

I'd like to help as well!

yenicelik commented 1 year ago

Yeah, this model seems cool. It feels like it's getting closer to true open-source LLMs. Is anyone aware of open-source efforts, similar to BLOOM, to reproduce and publish this?

simhallq commented 1 year ago

Hi @shamanez, I'm very interested in RETRO. How far along are you in an implementation?

mcschmitz commented 1 year ago

I did some digging, and apparently Nvidia is picking up this idea again. I found it via an article from April '23, which links to their GitHub repo and the corresponding paper by Nvidia. In that paper they describe how they reproduced RETRO, including the generation of the pretraining corpus. They mention that the implementation and the weights are not open-sourced, but just being able to regenerate the pretraining corpus would get us one step ahead.