VarunGumma opened this issue 1 year ago
Hi, thanks for your interest. I'm working on that. It should arrive in a week or two.
@gwenzek any update on the documentation?
Any update on this, @gwenzek?
@VarunGumma @abdr17 We are working on our first open-source training recipe right now and plan to release it in early January. I will keep you posted once it is released.
Hi @cbalioglu, I know generative models are all the hype at the moment, but I am curious whether fairseq2 will also support e.g. RoBERTa/XLM-R/data2vec pretraining :thinking: Would be awesome to have!
@cbalioglu Could you please tell us what kinds of model training recipes you are going to release (e.g., BERT, GPT)?
Hey @netw0rkf10w, @stefan-it, the plan is to have training recipes for NLLB (encoder/decoder machine translation), wav2vec2/w2v-BERT (encoder-based speech embedding SSL), and a fine-tuning recipe for LLaMA 7B/70B (in this order) in January/February. My goal is to cover the major architectures available in fairseq, so I can certainly try to prioritize other models depending on interest/demand. Please let me know if you have some particular models in mind.
Hi @cbalioglu, I would definitely vote for XLM-RoBERTa (there is still a lot of interest in it; see the XLM-V paper from EMNLP 2023) and data2vec (1 and 2), which has very promising and fresh training objectives :)
Many thanks in advance!
Thanks for the reply @cbalioglu ! I would like to vote for RoBERTa and data2vec as well!
@cbalioglu Along these lines, I was wondering if fairseq2 is ready for multi-node pre-training and fine-tuning of wav2vec2? I currently use fairseq, which has great support for efficient distributed training. Will I lose any of that by switching to fairseq2?
Hey @Pchatain, the recipe for encoder-decoder based machine translation is pretty much ready and I expect to merge it this week. I have started working on wav2vec2 and w2v-bert pretraining recipes (since we require them for our ongoing projects) and they will be ready in the next few weeks (definitely in March). Using those recipes will give you the same functionality as in fairseq.
@cbalioglu any updates?
@cbalioglu Hi, thanks for your work. I would like to vote for data2vec (1 and 2) too!
Any news on this?
Are there any updates regarding w2v-bert?
Is there an estimated timeline for when we will have documentation and training recipes for fairseq2 models, specifically w2v-BERT?
Hi all, if you look at the GitHub commits at https://github.com/facebookresearch/fairseq2/commits/main/ you can see that they are working on it; I am not sure constant questions about it are useful (@cbalioglu correct me if I am wrong).
I assume a +1 to their posts will be enough. On the other hand, @cbalioglu, if possible, stating which models will be implemented would already give people some assurance that they should wait and not try to move to https://github.com/NVIDIA/NeMo or https://github.com/espnet/espnet ;-)
Hey folks, sorry for the delays. As @orena1 mentioned, we are actively working on the training recipes, including more conventional ones originating from fairseq like wav2vec2 and BERT, as well as LLM pretraining and finetuning. We use and develop these recipes internally in FAIR for various projects, so we want to make sure that they have full parity with fairseq and the expected runtime/model performance. We are very close to releasing the recipes for wav2vec2 pretraining, wav2vec2 ASR finetuning, and LLM instruction finetuning in the next few weeks.
@cbalioglu Thank you for your work! Looking forward to the wav2vec2 pretraining recipe!
Is there any documentation or examples that I can refer to for training a transformer model from scratch using `fairseq2`? The `examples` folder in the repository seems empty.
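While the official fairseq2 recipes and docs are still in progress, below is a minimal, generic PyTorch sketch of what training a tiny transformer from scratch looks like. It deliberately does not use fairseq2's own API (which is not yet documented); every class name, hyperparameter, and the random stand-in data are illustrative assumptions, not part of fairseq2.

```python
# Generic PyTorch sketch (NOT fairseq2's API): train a tiny encoder-only
# transformer language model from scratch on random token data.
# All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB_SIZE = 1000  # hypothetical vocabulary size
MAX_LEN = 64       # hypothetical maximum sequence length
D_MODEL = 128

class TinyTransformerLM(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.pos = nn.Embedding(MAX_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=4, dim_feedforward=256, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.proj = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # Token + learned positional embeddings, then encode and project to vocab.
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.embed(tokens) + self.pos(positions)
        return self.proj(self.encoder(x))

model = TinyTransformerLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(10):
    # Random tokens stand in for a real tokenized corpus/data pipeline.
    tokens = torch.randint(0, VOCAB_SIZE, (8, MAX_LEN))
    logits = model(tokens[:, :-1])                      # predict next token
    loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss {loss.item():.4f}")
```

The upcoming fairseq2 recipes would presumably wrap a loop like this with their own data pipelines, checkpointing, and distributed-training support, so treat this only as a placeholder until the official examples land.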