shizhouxing / LLM-Detector-Robustness

[TACL] Code for "Red Teaming Language Model Detectors with Language Models"
BSD 3-Clause "New" or "Revised" License
16 stars, 3 forks

Apply for datasets #1

Open 1821518274 opened 1 week ago

1821518274 commented 1 week ago

Thank you very much for your excellent work. I have a strong interest in your research, find your methods very inspiring, and appreciate that you have open-sourced your code.

However, I noticed that the datasets do not seem to have been uploaded. Could I request access to the datasets used in your research, or can I find them elsewhere? I assure you that all data will be used strictly for academic purposes and in accordance with any conditions you specify.

If possible, I would appreciate it if you could upload the data somewhere or send it to me by email at siming8741@gmail.com.

Thank you again for your contribution!

shizhouxing commented 1 week ago

Hi @1821518274,

Thanks for your interest in our work! Could you elaborate on which dataset you are referring to? If you mean datasets like xsum, eli5, etc., they are loaded automatically from Hugging Face Datasets (https://huggingface.co/docs/datasets/en/index); see https://github.com/shizhouxing/LLM-Detector-Robustness/blob/master/dataset/load_data.py

1821518274 commented 1 week ago

Thanks for your reply. I was referring to the xsum and eli5 datasets. It seems they did not load correctly because of my poor network connection, so I will try downloading them locally, directly from Hugging Face. Thanks again for your reply, and good luck with your work!
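
For readers who hit the same network problem, here is a minimal sketch of one way to retry a flaky download and keep a local copy. It is not taken from the repository's `load_data.py`. The helper names `load_with_retries` and `fetch_and_save`, the retry count, and the delays are all hypothetical. The sketch assumes the `datasets` package is installed.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def load_with_retries(
    loader: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `loader` until it succeeds, doubling the wait after each failure.

    Hypothetical helper: any exception is treated as retryable, which is a
    simplification chosen for illustration.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return loader()
        except Exception:
            # Re-raise the last error once the retry budget is spent.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def fetch_and_save(name: str, split: str, out_dir: str):
    """Download one split via Hugging Face Datasets and save it to `out_dir`.

    Hypothetical helper; `name` and `split` must match what the Hub provides.
    """
    # Imported lazily so the retry helper works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_with_retries(lambda: load_dataset(name, split=split))
    dataset.save_to_disk(out_dir)
    return dataset
```

A call such as `fetch_and_save("xsum", "test", "data/xsum_test")` would then download the split once, and `datasets.load_from_disk("data/xsum_test")` could reload it later without network access. The identifier `"xsum"`, the split name, and the output path are assumptions here and may differ from what the repository actually requests.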