LAION-AI / Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
https://open-assistant.io
Apache License 2.0
37.04k stars 3.23k forks

Add Tatoeba QnA dataset #3115

Closed: echo0x22 closed this issue 1 year ago

echo0x22 commented 1 year ago

Multilingual Tatoeba Q&A Translation Dataset

120K entries

This dataset contains a list of instructions to translate or paraphrase in multiple languages. It is available in Parquet format and includes the following columns:

The data in this dataset was collected through crowdsourcing efforts and includes translations of various types of content, such as sentences, phrases, idioms, and proverbs.

You can find it here: https://huggingface.co/datasets/0x22almostEvil/tatoeba-mt-qna-oa

The original dataset is available here: https://huggingface.co/datasets/Helsinki-NLP/tatoeba_mt
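To make the "instructions to translate" idea concrete, here is a minimal sketch of how one sentence pair could be turned into a Q&A entry. The prompt templates, the `make_entry` helper, and the field names `instruction`, `response`, and `source` are illustrative assumptions. They are not taken from the dataset, whose actual columns and wording may differ.

```python
import random

# Hypothetical prompt templates; the real dataset's wording may differ.
TEMPLATES = [
    "Translate the following sentence from {src} to {tgt}: {text}",
    "How would you say this in {tgt}? {text}",
]


def make_entry(src_lang, tgt_lang, src_text, tgt_text, rng):
    """Build one Q&A record from a translated sentence pair.

    The field names below are assumptions for illustration only.
    """
    template = rng.choice(TEMPLATES)
    return {
        "instruction": template.format(src=src_lang, tgt=tgt_lang, text=src_text),
        "response": tgt_text,
        "source": "tatoeba",
    }


# A seeded generator keeps the template choice reproducible.
rng = random.Random(0)
entry = make_entry("English", "German", "Good morning.", "Guten Morgen.", rng)
print(entry["instruction"])
print(entry["response"])
```

Picking a template at random per pair is one simple way to vary the phrasing of instructions across a large set of sentence pairs.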

dewasahu2003 commented 1 year ago

@0x22almostEvil Hi 👋 Could I take this issue?

echo0x22 commented 1 year ago

> @0x22almostEvil Hi 👋 Could I take this issue?

I've already made it (check the PR above) 😅 I created this issue because of the "dataset pull-request" guidelines in the README.

echo0x22 commented 1 year ago

[Screenshot: Screenshot_20230510-212141_Bromite]

dewasahu2003 commented 1 year ago

okay