MiuLab / Taiwan-LLM

Traditional Mandarin LLMs for Taiwan
https://twllm.com
Apache License 2.0
965 stars, 84 forks

Training datasets are not available #27

Closed by Lifulifu 8 months ago

Lifulifu commented 10 months ago

The training datasets yentinglin/traditional_mandarin_instructions and yentinglin/zh_TW_c4 are currently unavailable on Hugging Face. Will they be made available again, or is there another way to access the data?

adamlin120 commented 9 months ago

Our legal advisors raised concerns about copyrighted material, so we've temporarily removed the datasets. We're actively seeking further opinions and hope to make the training datasets available again soon. Rest assured, the models are not affected by this issue.

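Acting on advice from professional attorneys, we have temporarily taken down the datasets that raise copyright concerns. We will actively seek further legal opinions and hope to put these training datasets back online as soon as possible. The models themselves are not subject to these concerns, so please rest assured.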