So far, I haven't found any concrete example of organizing a dataset and training from scratch. Does fairseq2 intend to provide separate trainer and dataset libraries, like Lightning/Accelerate and Hugging Face Datasets? My impression is that fairseq2 currently focuses mainly on model details, much like Hugging Face's Transformers library.
I'd like to decouple training from modeling so that the training part can be iterated on quickly; one of the main pain points with self-supervised large models right now is that training is time-consuming and slow.
Also, I hope fairseq2 will provide well-organized recipes, perhaps like the open-source speech projects (e.g. Kaldi, ESPnet, WeNet), to address the fact that the fairseq examples are unfriendly to new users and missing some details.