Open sweetdream779 opened 2 weeks ago
Hello! Could you please send me a link to the specific data you used? I tried to find it in the "ghostbuster" repo, but wasn't sure which files you meant. Maybe you mean something from https://github.com/vivek3141/ghostbuster-data/ ?
Also, I can say in advance that if the texts were generated by gpt-4 or gpt-4-o, they will likely NOT be detected by our method, even if it is applied correctly. :( We didn't manage to do anything about this. It seems to be a property of these newer models. If the data was generated by gpt-3.5 or earlier, I can try to see what exactly is wrong.
Hello! I am trying to apply your method (a threshold classifier based on PHD or MLE) to the Ghostbuster (https://github.com/vivek3141/ghostbuster) dataset. Here is my code, which is based on yours:
I get very low performance on 900 randomly sampled texts (450 from the "ai" category and 450 from the "human" category): F1: 0.667, Accuracy: 0.500, AI Accuracy: 1.000, Human Accuracy: 0.000, TPR at 1.0% FPR: 0.3%
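For what it's worth, these numbers are exactly what comes out when every sample is labeled "ai" on a balanced 450/450 split. A minimal check in plain Python, where `summarize` is a hypothetical helper (not from the repo) and label 1 stands for "ai":

```python
def summarize(y_true, y_pred):
    """Return F1 (for label 1), accuracy, and per-class accuracy.

    Assumes both classes are present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "f1": 2 * tp / (2 * tp + fp + fn) if tp else 0.0,
        "accuracy": (tp + tn) / len(pairs),
        "ai_accuracy": tp / (tp + fn),
        "human_accuracy": tn / (tn + fp),
    }


# 450 "ai" samples (1) and 450 "human" samples (0), everything predicted as "ai".
y_true = [1] * 450 + [0] * 450
y_pred = [1] * 900
metrics = summarize(y_true, y_pred)
print({k: round(v, 3) for k, v in metrics.items()})
# {'f1': 0.667, 'accuracy': 0.5, 'ai_accuracy': 1.0, 'human_accuracy': 0.0}
```

So at this threshold the classifier is not separating the two classes at all; it simply puts every text on the "ai" side.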
The threshold (13.5) was selected as follows: I took another 900 samples from the dataset and picked the value from the [6, 15] range that gave the best F1 score.
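The selection procedure can be sketched roughly like this. It is only an illustration under stated assumptions: a text is predicted "ai" when its score is below the threshold, the grid step is 0.5, and the names `f1_for_threshold` and `select_threshold` are hypothetical, not taken from the repo.

```python
def f1_for_threshold(scores, labels, threshold):
    """F1 for the "ai" class (label 1).

    Assumption: a sample is predicted "ai" when score < threshold.
    """
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        pred = 1 if score < threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def select_threshold(scores, labels, low=6.0, high=15.0, step=0.5):
    """Return the first grid value in [low, high] with the best F1."""
    n_steps = int(round((high - low) / step))
    candidates = [low + i * step for i in range(n_steps + 1)]
    return max(candidates, key=lambda t: f1_for_threshold(scores, labels, t))


# Toy, well-separated scores purely for illustration (not real PHD values).
toy_scores = [7.0, 8.0, 9.0, 10.5, 11.0, 12.0]
toy_labels = [1, 1, 1, 0, 0, 0]
best = select_threshold(toy_scores, toy_labels)
print(best)  # 9.5
```

One caveat with this criterion: F1 on the "ai" class alone can favor labeling everything as "ai" when the two score distributions overlap heavily, so the selected value may drift toward the top of the range.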
Even if I take a lower threshold based on your paper (e.g. 9.0), I still get low metrics: F1: 0.399, Accuracy: 0.536, AI Accuracy: 0.309, Human Accuracy: 0.762, TPR at 1.0% FPR: 0.3%
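Note that "TPR at 1.0% FPR" does not depend on the threshold I pick, which is consistent with both runs reporting 0.3%. A rough sketch of how such a number can be computed from raw scores, assuming lower scores mean "ai"; `tpr_at_fpr` is a hypothetical helper, not the code used for the metrics above:

```python
def tpr_at_fpr(ai_scores, human_scores, max_fpr=0.01):
    """Fraction of "ai" samples flagged when at most max_fpr of "human"
    samples are flagged.

    Assumption: a sample is flagged as "ai" when score < cutoff.
    """
    ordered = sorted(human_scores)
    allowed = int(max_fpr * len(ordered))  # false positives we may tolerate
    # Only the `allowed` lowest human scores can fall strictly below this cutoff.
    cutoff = ordered[allowed]
    flagged = sum(1 for s in ai_scores if s < cutoff)
    return flagged / len(ai_scores)


# Toy numbers purely for illustration (not real PHD values).
toy_human = [10 + i * 0.01 for i in range(100)]
toy_ai = [9.0, 10.005, 11.0, 12.0]
print(tpr_at_fpr(toy_ai, toy_human))  # 0.5
```

A value as low as 0.3% would mean that almost no "ai" scores fall below the lowest 1% of "human" scores, i.e. the two score distributions overlap almost completely at that end.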
Is there something I am doing wrong?