Open sweetdream779 opened 1 month ago
Hello! Could you please send me a link to the particular data you used? I tried to find it in the "ghostbusters" repo, but wasn't sure which files you meant. Maybe you mean something from https://github.com/vivek3141/ghostbuster-data/ ?
Also, I can say in advance that if they used text generated by gpt-4 or gpt-4-o, it will likely NOT be detected by our method, even if applied correctly. :( We didn't manage to do anything about this. This seems to be a property of these new models. If the data was generated by gpt-3.5 or earlier, I can try to see what exactly is wrong.
As I understood from their paper, they used the gpt-3.5-turbo model for generation. I uploaded the data that I used to Google Drive. Valid data link and test data link.
Thank you for the data links, and sorry for the late response. I forgot about this issue due to a lot of work and remembered it only recently.
I ran a test with this data and your code. I didn't find any problems in your code, and I also manually checked the PHD on examples from this dataset myself. The difference between the PH dimensions of the human and AI subsets of the data is indeed too small to build a reasonable threshold-based classifier.
So it seems like you did everything correctly; our threshold-based classifier simply doesn't work on this data.
Hello! I am trying to apply your method (a threshold classifier based on PHD or MLE) to the Ghostbuster (https://github.com/vivek3141/ghostbuster) dataset. Here is my code, which is based on yours:
And I get very low performance on 900 randomly selected samples (450 from the "ai" category and 450 from the "human" category): F1: 0.667, Accuracy: 0.500, AI Accuracy: 1.000, Human Accuracy: 0.000, TPR at 1.0% FPR: 0.3%
The threshold (13.5) was selected as follows: I took another 900 samples from the dataset and picked the threshold from the [6, 15] range that gave the best F1 score.
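For reference, the selection procedure described above can be sketched roughly like this. This is only an illustration, not the code from the repo: the function names, the 0.1 step, and the convention that a text is predicted "ai" when its PHD score is below the threshold are all assumptions.

```python
import numpy as np

def f1_for_threshold(scores, labels, threshold):
    """F1 for the 'ai' class (label 1).

    Assumption: a sample is predicted 'ai' when score < threshold.
    """
    scores = np.asarray(scores, dtype=float)
    is_ai = np.asarray(labels) == 1
    pred_ai = scores < threshold
    tp = np.sum(pred_ai & is_ai)
    fp = np.sum(pred_ai & ~is_ai)
    fn = np.sum(~pred_ai & is_ai)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

def select_threshold(scores, labels, low=6.0, high=15.0, step=0.1):
    """Sweep thresholds in [low, high] and return (best_threshold, best_f1).

    The step size is a hypothetical choice; ties go to the lowest threshold.
    """
    candidates = np.arange(low, high + step / 2, step)
    f1s = [f1_for_threshold(scores, labels, t) for t in candidates]
    best = int(np.argmax(f1s))
    return float(candidates[best]), f1s[best]
```

One thing worth noting: on a balanced set, a threshold above every score predicts "ai" for everything, which gives precision 0.5, recall 1.0, and F1 = 2/3 ≈ 0.667. So when the two score distributions overlap heavily, a best-F1 sweep can end up at such a degenerate threshold.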
Even if I take a lower threshold based on your paper (e.g. 9.0), I still get low metrics: F1: 0.399, Accuracy: 0.536, AI Accuracy: 0.309, Human Accuracy: 0.762, TPR at 1.0% FPR: 0.3%
Is there something I am doing wrong?
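In case it helps to reproduce the numbers, here is a minimal sketch of how metrics of this kind could be computed from per-sample scores. It is an assumption-laden illustration rather than the exact evaluation code: `evaluate` and `tpr_at_fpr` are hypothetical helper names, label 1 means "ai", and a sample is predicted "ai" when its score is below the threshold.

```python
import numpy as np

def evaluate(scores, labels, threshold):
    """F1, accuracy, and per-class accuracy for a fixed threshold.

    Assumptions: label 1 is 'ai', label 0 is 'human', both classes are
    present, and a sample is predicted 'ai' when score < threshold.
    """
    scores = np.asarray(scores, dtype=float)
    is_ai = np.asarray(labels) == 1
    pred_ai = scores < threshold
    tp = np.sum(pred_ai & is_ai)
    fp = np.sum(pred_ai & ~is_ai)
    fn = np.sum(~pred_ai & is_ai)
    denom = 2 * tp + fp + fn
    return {
        "f1": float(2 * tp / denom) if denom else 0.0,
        "accuracy": float(np.mean(pred_ai == is_ai)),
        "ai_accuracy": float(np.mean(pred_ai[is_ai])),
        "human_accuracy": float(np.mean(~pred_ai[~is_ai])),
    }

def tpr_at_fpr(scores, labels, max_fpr=0.01):
    """Highest 'ai' recall over all thresholds whose human FPR <= max_fpr."""
    scores = np.asarray(scores, dtype=float)
    is_ai = np.asarray(labels) == 1
    ai_scores, human_scores = scores[is_ai], scores[~is_ai]
    best = 0.0
    # Every observed score, plus +inf, covers all distinct operating points.
    for t in np.append(np.unique(scores), np.inf):
        fpr = np.mean(human_scores < t)
        if fpr <= max_fpr:
            best = max(best, float(np.mean(ai_scores < t)))
    return best
```

Unlike the fixed-threshold metrics, `tpr_at_fpr` does not depend on the chosen threshold, which would be consistent with it staying at 0.3% for both 13.5 and 9.0.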