ShadowTinker closed this issue 6 months ago
Hi. Thanks for your interest! We have released the original SOTA data pool at https://huggingface.co/datasets/AndrewZeng/deita_sota_pool
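For anyone who wants to compare the released pool against the statistics in the paper, here is a rough sketch of how that check might look. It is not taken from the repo: the `train` split name and the `source` and `conversations` field names are assumptions about the dataset schema, so print the column names first and adjust. `pool_stats` and `load_pool` are hypothetical helper names.

```python
from collections import Counter


def pool_stats(records, source_key="source", turns_key="conversations"):
    """Summarize a pool: sample count, samples per source, mean turns per sample.

    `source_key` and `turns_key` are assumed field names, not confirmed
    against the released dataset; pass the real ones if they differ.
    """
    records = list(records)
    # Count how many samples come from each source (e.g. ShareGPT, UltraChat).
    per_source = Counter(r.get(source_key, "unknown") for r in records)
    # Treat a missing or empty conversation field as zero turns.
    total_turns = sum(len(r.get(turns_key) or []) for r in records)
    mean_turns = total_turns / len(records) if records else 0.0
    return {
        "samples": len(records),
        "per_source": dict(per_source),
        "mean_turns": mean_turns,
    }


def load_pool(repo_id="AndrewZeng/deita_sota_pool", split="train"):
    """Download the released pool from the Hugging Face Hub.

    Requires the `datasets` package and network access. The split name
    "train" is an assumption.
    """
    # Imported lazily so pool_stats can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split)


# Example usage (downloads the data, so it is left commented out):
# pool = load_pool()
# print(pool.column_names)   # check the real field names first
# print(pool_stats(pool))
```

If the per-source counts do not line up with the paper, the field names are the first thing to check, since a wrong `source_key` puts every sample under `"unknown"`.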
Thanks a ton for your help with the dataset issue I raised! I greatly appreciate the time you took to address my problem. Your work on the repository is amazing, and you're clearly committed to helping the community.
Hi, first of all, thank you for your work and the great repo!
As stated in the title, could you please provide the original data pool used in your paper, especially $X_{sota}$? I tried to obtain the datasets by following the references in the paper, but I cannot find versions of ShareGPT and UltraChat on Hugging Face that match the statistics stated in the paper. I would greatly appreciate it if you could provide the dataset or explain how to filter the two datasets out of the existing Hugging Face datasets.
Best regards