Request for Release of Pretrained NLLB-LLM2Vec Model
Open ArkadeepAcharya opened 1 week ago
Hello Team,
Could you please release the pretrained NLLB-LLM2Vec models mentioned in your paper on "Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages"? It would greatly benefit the community by facilitating further research.
Thank you for your contributions.
Best regards, Arkadeep Acharya
Hi Arkadeep,
Thanks a lot for your interest in our work!
Yes, I very much plan on making the models available. :)
I am currently working on refining Stage 1 so that Stage 2 won't be necessary. My sincere hope is that I can then release a single pre-trained model that can easily be fine-tuned on any downstream task, achieving maximum performance without task distillation.
In any case, I will make the self-supervised adapted model (S1) from the paper available ASAP. Unfortunately, directly fine-tuning it will only give you good performance if you have sizable training data (as for NLI or Belebele).
Cheers, Fabian
Thanks Fabian! Looking forward to the model release!
As a quick update: sharing the model on the Hugging Face Hub is surprisingly difficult, since it has to be correctly quantized and LoRA-fied prior to loading the weights. transformers, peft, and bitsandbytes don't play nicely together when setting up AutoModel.from_pretrained the conventional way, and unfortunately none of this is well documented.
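For anyone who wants to experiment in the meantime, the manual setup looks roughly like the sketch below; the base model ID, LoRA hyperparameters, target modules, and checkpoint filename are placeholders for illustration, not our actual configuration.

import torch
from transformers import AutoModel, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantization settings; the exact values here are illustrative
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# 1) load the quantized backbone first (placeholder model ID)
base = AutoModel.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", quantization_config=bnb_config
)

# 2) then attach LoRA adapters (placeholder hyperparameters)
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, lora_config)

# 3) and only then load the adapted weights on top (placeholder filename)
state_dict = torch.load("nllb-llm2vec-s1.pt", map_location="cpu")
model.load_state_dict(state_dict, strict=False)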
Between having been sick and working on the more general model, I haven't yet had sufficient time to figure out how best to upload the model so that it is most easily used, i.e.
from transformers import AutoModel
model = AutoModel.from_pretrained("fdschmidt/nllb-llm2vec-v0.1")
I might have to package it more generally as an nn.Module (cf. https://huggingface.co/docs/hub/models-uploading#upload-a-pytorch-model-using-huggingfacehub). I'll be on vacation next week but will try to squeeze it in.
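If I go the nn.Module route, the packaging would roughly follow the PyTorchModelHubMixin pattern from those docs; the class below is only a stripped-down illustration, not the actual NLLB-LLM2Vec code.

import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class NLLBLLM2Vec(nn.Module, PyTorchModelHubMixin):
    def __init__(self, hidden_size: int = 4096):
        super().__init__()
        # placeholder submodule; the real model would wire up the NLLB encoder,
        # the quantized LLM backbone, and the LoRA adapters here
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):
        return self.proj(x)

# the mixin then provides push_to_hub / from_pretrained for free:
# NLLBLLM2Vec(hidden_size=4096).push_to_hub("fdschmidt/nllb-llm2vec-v0.1")
# model = NLLBLLM2Vec.from_pretrained("fdschmidt/nllb-llm2vec-v0.1")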