facebookresearch/fairseq2

FAIR Sequence Modeling Toolkit 2
https://facebookresearch.github.io/fairseq2/
MIT License

Missing documentation on how to train a model #80

Open VarunGumma opened 11 months ago

VarunGumma commented 11 months ago

Is there any documentation or example that I can refer to for training a transformer model from scratch using fairseq2? The examples folder in the repository seems to be empty.

gwenzek commented 11 months ago

Hi, thanks for your interest. I'm working on that. It should arrive in a week or two.

VarunGumma commented 9 months ago

@gwenzek Any update on the documentation?

abdr17 commented 8 months ago

> Hi, thanks for your interest. I'm working on that. It should arrive in a week or two.

Any update on this, @gwenzek?

cbalioglu commented 8 months ago

@VarunGumma @abdr17 We are working on our first open-source training recipe right now and plan to release it in early January. I will keep you posted once it is released.

stefan-it commented 7 months ago

Hi @cbalioglu, I know generative models are all the hype at the moment, but I am curious whether fairseq2 would also support e.g. RoBERTa/XLM-R/data2vec pretraining :thinking: Would be awesome to have!

netw0rkf10w commented 7 months ago

@cbalioglu Could you please tell what kind of model training you are going to release? (e.g., BERT, GPT, etc.)

cbalioglu commented 7 months ago

Hey @netw0rkf10w, @stefan-it, the plan is to have training recipes for NLLB (encoder/decoder machine translation) and wav2vec2/w2v-BERT (encoder-based speech embedding SSL), plus a fine-tuning recipe for LLaMA 7B/70B (in that order), in January/February. My goal is to cover the major architectures available in fairseq, so I can certainly try to prioritize other models depending on the interest/demand. Please let me know if you have any particular models in mind.

stefan-it commented 7 months ago

Hi @cbalioglu, I would definitely vote for XLM-RoBERTa (there's still a lot of interest in it; see the EMNLP 2023 paper on XLM-V) and data2vec (1 and 2), because data2vec has very promising, fresh training objectives :)

Many thanks in advance!

netw0rkf10w commented 7 months ago

Thanks for the reply @cbalioglu ! I would like to vote for RoBERTa and data2vec as well!

Pchatain commented 6 months ago

@cbalioglu Along these lines, I was wondering whether fairseq2 is ready for multi-node pre-training and fine-tuning of wav2vec2. I currently use fairseq, which has great support for efficient distributed training. Will I lose any of that by switching to fairseq2?

cbalioglu commented 6 months ago

Hey @Pchatain, the recipe for encoder-decoder machine translation is pretty much ready, and I expect to merge it this week. I have started working on the wav2vec2 and w2v-BERT pretraining recipes (since we require them for our ongoing projects), and they will be ready in the next few weeks (definitely in March). Using those recipes will give you the same functionality as in fairseq.

jcuenod commented 4 months ago

@cbalioglu any updates?

cageyoko commented 4 months ago

@cbalioglu Hi, thanks for your work. I would like to vote for data2vec (1 and 2) too!

JAVI897 commented 4 months ago

Any news on this?

kssmmm commented 3 months ago

Are there any updates regarding w2v-bert?

kdcyberdude commented 3 months ago

Is there an estimated timeline for when we will have documentation and training recipes for fairseq2 models, specifically w2v-BERT?

orena1 commented 3 months ago

Hi all, if you look at the GitHub commits (https://github.com/facebookresearch/fairseq2/commits/main/) you can see that they are working on it; I am not sure constant questions about it are useful (@cbalioglu, correct me if I am wrong).

I assume a +1 on their posts will be enough. On the other hand, @cbalioglu, if possible, stating which models' training recipes will be implemented would already give people some assurance that they should wait and not try to move to https://github.com/NVIDIA/NeMo or https://github.com/espnet/espnet ;-)

cbalioglu commented 3 months ago

Hey folks, sorry for the delays. As @orena1 mentioned, we are actively working on the training recipes, including more conventional ones like wav2vec2 and BERT originating from fairseq, as well as LLM pretraining and finetuning. We use and develop these recipes internally at FAIR for various projects, so we want to make sure that they have full parity and the expected runtime/model performance. We are very close to releasing the recipes for wav2vec2 pretraining, wav2vec2 ASR finetuning, and LLM instruction finetuning in the next few weeks.

gau-nernst commented 3 months ago

@cbalioglu Thank you for your work! Looking forward to the wav2vec2 pretraining recipe!

jcuenod commented 5 days ago

https://github.com/facebookresearch/fairseq2/tree/main/src/fairseq2/recipes