FreedomIntelligence / AceGPT

Apache License 2.0
111 stars, 7 forks

Any plan to release the dataset? #10

Closed: ashmalvayani closed this issue 6 months ago

ashmalvayani commented 7 months ago

Hello. Thanks for your amazing work. Do you have any plans to release both the pre-training datasets (the full 30B and 10B for the 7B and 13B models, respectively) and the fine-tuning datasets (especially Quora-Arabic)? If they are already available, could you please share a link to them?

wabyking commented 7 months ago

https://huggingface.co/datasets/FreedomIntelligence/Quora-Arabic-GPT4
https://huggingface.co/datasets/FreedomIntelligence/Code-Alpaca-Arabic-GPT4

Please see the links above; we have just made them public.