Closed XuandongZhao closed 5 months ago
Our new ICML paper studies membership inference attacks (data contamination) in a black-box setting.
Title: DE-COP: Detecting Copyrighted Content in Language Models Training Data (ICML 2024)
Paper: https://arxiv.org/abs/2402.09910
Code: https://github.com/LeiLiLab/DE-COP?tab=readme-ov-file
Datasets:
https://huggingface.co/datasets/avduarte333/BookTection
https://huggingface.co/datasets/avduarte333/arXivTection
It would be my pleasure if my work could be included in the repo. Thank you!
Thanks for letting me know I missed your work. I have added your paper to the list.