LAION-AI / Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
https://open-assistant.io
Apache License 2.0
36.92k stars 3.22k forks

Add Farsi Datasets to train base LLM #3625

Closed: pourmand1376 closed this 1 year ago

pourmand1376 commented 1 year ago

I added these datasets because the model doesn't seem to understand standard Persian.

github-actions[bot] commented 1 year ago

:x: pre-commit failed. Please run `pre-commit run --all-files` locally and commit the changes. Find more information in the repository's CONTRIBUTING.md

olliestanley commented 1 year ago

The README needs a fuller description of each dataset: what it is, how it can be loaded, and what it is intended for (e.g. pretraining, instruction tuning).

pourmand1376 commented 1 year ago

Okay, I have read this tutorial and will submit another pull request that follows the Open-Assistant standards.