Closed: virgulvirgul closed this issue 1 year ago
@virgulvirgul There is no new text-to-text data or metadata, only speech/text and speech/speech metadata. For text-to-text parallel data used to train text-to-text machine translation, you can refer to https://arxiv.org/abs/2207.04672, https://opus.nlpl.eu/, or https://huggingface.co/datasets/allenai/nllb
Do you plan to share the "Text to Text Metadata" translation dataset?