Closed s1ghhh closed 1 year ago
I just came here to ask the same thing. I'm guessing it was TOU / TOS?
Sorry for the inconvenience. I took it down because (1) I wanted to clean the dataset further and (2) I did not provide the source for all answers, which would violate the license. However, Hugging Face now hosts a much better StackExchange dataset at https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences.
Thank you for your response; I understand your situation. I have looked at the dataset you linked: its answers are often lengthy, and one question typically has multiple answers. May I ask whether your dataset paired each question with a single answer? Additionally, the data contains many HTML tags and links. Will you remove them? Once again, thank you for your response and for sharing.
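For anyone who wants to do this cleanup themselves, here is a minimal sketch of the two steps asked about above: keeping one answer per question and stripping HTML tags. It uses only the Python standard library. The record layout (`question`, `answers`, `text`, `score`) is an assumption made for illustration and may not match the actual schema of the Hugging Face dataset, so adjust the field names after inspecting a real record.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and drops every tag (so <a href=...> loses its URL)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html):
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def best_answer(record):
    """Keep only the highest-scored answer of a question.

    Assumes a hypothetical record shape:
    {"question": str, "answers": [{"text": str, "score": int}, ...]}
    """
    top = max(record["answers"], key=lambda a: a["score"])
    return {
        "question": strip_html(record["question"]),
        "answer": strip_html(top["text"]),
    }


# Small made-up example record, not taken from any real dataset.
example = {
    "question": "<p>How do I <b>sort</b> a list?</p>",
    "answers": [
        {"text": "<p>Use <code>sorted()</code>.</p>", "score": 5},
        {"text": '<p>See <a href="https://example.com">this</a></p>', "score": 1},
    ],
}
print(best_answer(example))
```

Note that this keeps the link text and drops only the `href` target; bare URLs written directly in the text are left untouched and would need a separate pass.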
Hi @kbressem, I read the previous messages. In that case, will the dataset be made available in the near future?
No. The dataset on Hugging Face is really good, and I see no benefit in uploading another crawl of the same data. Please give the Hugging Face dataset a try.
Thank you for open-sourcing such a fantastic project. Since the links in the README are no longer working, I would like to know where I can access the StackExchange dataset series.