Closed HuangChiEn closed 1 year ago
Thanks for your interest!
For IFT, v1.0 was trained on ~500k examples (all in Mandarin), including manually written examples and examples from proprietary models. I also wrote ~100 seed QA pairs and paraphrased them using model-based approaches.
Lots of interesting Mandarin instruction sets have been released on Hugging Face by the community. Please check them out :)
By the way, I have re-listed our IFT dataset on Hugging Face: https://huggingface.co/datasets/yentinglin/traditional_mandarin_instructions
Thanks for releasing this amazing work. Since both training datasets are currently unavailable on Hugging Face due to license concerns, could you please provide the spec of the instruction tuning dataset?
We want to find an alternative Traditional Chinese dataset with the same spec.
Spec: