eric-mitchell / detect-gpt

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
MIT License

Models not set to eval mode #17

Open bemi127 opened 6 months ago

bemi127 commented 6 months ago

Why are the models not set to eval() mode before running inference? If my understanding is correct, this means that dropout, etc. will be applied when perturbations are being generated and log likelihood is estimated.

joegenius98 commented 5 months ago

I'm no expert, but I believe that because none of the models are actually trained in DetectGPT's experiments, there isn't necessarily a need to put them in eval mode.

But I agree it is still good practice to call .eval() explicitly, even when you're just loading pretrained weights.

bemi127 commented 5 months ago

It is correct that the base models are not explicitly retrained or fine-tuned. However, the perplexities in eval mode will be more accurate, because dropout is disabled and each attention layer sees its full inputs instead of a randomly zeroed subset.

joegenius98 commented 5 months ago

The good thing is that we can try it out and see what happens.
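
One thing worth checking first: the Hugging Face `from_pretrained` documentation states that models are returned in evaluation mode by default. Assuming the models here are loaded that way, `model.training` may already be `False`, and printing it would settle the question.

To see what the difference would look like if dropout were active, here is a toy sketch in plain Python. It is not code from this repository and it does not use PyTorch. The `dropout` and `log_likelihood` helpers, the layer sizes, and the dropout rate of 0.1 are all assumptions chosen for illustration. It scores the same input 50 times with dropout on ("train" mode) and 50 times with dropout off ("eval" mode).

```python
import math
import random


def dropout(values, p, training, rng):
    """Inverted dropout: zero each value with probability p, rescale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return list(values)  # eval mode: dropout is a no-op
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def log_likelihood(hidden, weights, target, p, training, rng):
    """Log-probability of `target` under softmax(weights @ dropout(hidden))."""
    h = dropout(hidden, p, training, rng)
    logits = [sum(w * x for w, x in zip(row, h)) for row in weights]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[target] - log_z


# Hypothetical toy "model": one hidden vector and a 5-way output layer.
rng = random.Random(0)
hidden = [rng.uniform(-1, 1) for _ in range(16)]
weights = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(5)]

# Score the same input repeatedly in each mode.
train_scores = [log_likelihood(hidden, weights, 2, 0.1, True, rng) for _ in range(50)]
eval_scores = [log_likelihood(hidden, weights, 2, 0.1, False, rng) for _ in range(50)]

print("train mode: distinct log-likelihoods =", len(set(train_scores)))
print("eval mode:  distinct log-likelihoods =", len(set(eval_scores)))
```

In eval mode every call returns the same value, while in train mode the score changes from call to call. Any such noise would matter for DetectGPT, since the method compares the log-likelihood of a passage against those of its perturbations. In PyTorch, calling `model.eval()` is what switches dropout off.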