yuvalkirstain / PickScore

MIT License

Banned User List for dataset #13

Closed: lrzjason closed this issue 6 months ago

lrzjason commented 6 months ago

Thanks for the great work. I have read your paper, which states that banned users are filtered out of the dataset. I am working on the v2 dataset. Could you provide the banned user list for reference? After filtering around 2,000 pairs, I noticed that some users may have intentionally chosen the wrong option. I don't know whether the banned user list would help filter out those 'bad users'.

Also, if I want to fine-tune the model on my filtered pairs, how should I prepare the dataset? I would appreciate any advice.

yuvalkirstain commented 6 months ago

You can see all the banned users here: https://github.com/yuvalkirstain/heroku_app/blob/main/main.py#L114. I also suggest using a better word filter to find malicious users. Thank you so much, and I can't wait to see your new, cleaner version!
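For readers following along, here is a minimal sketch of the two filters discussed above: dropping pairs from banned users, then dropping pairs whose prompt contains a blocked word. It is not the code from the linked `main.py`. The field names `user_id` and `caption`, the function name `filter_pairs`, and the sample IDs and words are all assumptions made for illustration; substitute the real banned IDs from the linked file and your own word list.

```python
import re

def filter_pairs(pairs, banned_user_ids, blocked_words):
    """Keep pairs whose user is not banned and whose caption has no blocked word.

    `pairs` is a list of dicts with hypothetical keys "user_id" and "caption".
    """
    banned = set(banned_user_ids)
    blocked = {word.lower() for word in blocked_words}
    kept = []
    for pair in pairs:
        # Filter 1: skip everything submitted by a banned user.
        if pair["user_id"] in banned:
            continue
        # Filter 2: skip captions containing any blocked word (whole-word match).
        tokens = set(re.findall(r"[a-z0-9']+", pair["caption"].lower()))
        if tokens & blocked:
            continue
        kept.append(pair)
    return kept

# Placeholder data, not taken from the real dataset or the real banned list.
sample_pairs = [
    {"user_id": 1, "caption": "a red fox in the snow"},
    {"user_id": 2, "caption": "a blue car"},
    {"user_id": 3, "caption": "some SPAMWORD here"},
]
clean_pairs = filter_pairs(sample_pairs, banned_user_ids=[2], blocked_words=["spamword"])
```

A word filter like this only catches malicious prompts. Users who deliberately pick the wrong image would need a separate check, for example comparing each user's choices against other annotators.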

lrzjason commented 6 months ago

Thanks for the reply. Much appreciated.