BurhanUlTayyab / DetectGPT

PyTorch implementation of DetectGPT (https://arxiv.org/pdf/2301.11305v1.pdf)
https://gptzero.sg
MIT License
177 stars, 43 forks

Bug in chooseBestFittingText? #5

Closed bjornhertzberg closed 1 year ago

bjornhertzberg commented 1 year ago

When I run the code, I notice that return_text is scrambled. I believe this is due to a bug in the search pattern passed to re.finditer.

Current: mask_indices = list(re.finditer("[MASK]", mask_text))

Proposed: mask_indices = list(re.finditer(r"\[MASK\]", mask_text))

The current pattern is read as a character class, so it returns the positions of the individual letters M, A, S and K (each match span always has length 1). What is needed is the position of the full string, including the opening and closing brackets.
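To make the difference concrete, here is a minimal sketch comparing the two patterns. The sample sentence and the variable names `char_class_spans` and `literal_spans` are made up for illustration and do not come from the repository.

```python
import re

# Illustrative input; not taken from the repository.
mask_text = "The [MASK] sat on the [MASK]."

# Unescaped: "[MASK]" is a character class that matches any single
# character out of M, A, S, K, so every match is one character long.
char_class_spans = [m.span() for m in re.finditer("[MASK]", mask_text)]

# Escaped: matches the literal six-character token "[MASK]".
literal_spans = [m.span() for m in re.finditer(r"\[MASK\]", mask_text)]

print(char_class_spans)
print(literal_spans)
```

If the token is kept in a variable, `re.escape("[MASK]")` should produce an equivalent pattern without hand-written backslashes.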

I also noticed an issue with the offset that I had to adjust for: I removed the -1 when setting the start position and updated the offset on each loop iteration.
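For illustration, the offset bookkeeping could look roughly like the sketch below. This is not the repository's chooseBestFittingText; `fill_masks` and its `fills` parameter are hypothetical names, and the sketch assumes one replacement string per mask, applied left to right.

```python
import re

MASK = "[MASK]"

def fill_masks(mask_text, fills):
    """Replace the i-th [MASK] in mask_text with fills[i] (hypothetical helper)."""
    # Match positions refer to the ORIGINAL string.
    matches = list(re.finditer(re.escape(MASK), mask_text))
    result = mask_text
    offset = 0  # net length change caused by the replacements made so far
    for match, fill in zip(matches, fills):
        # Shift the original positions by the accumulated offset; no -1 needed,
        # because match.start() already points at the opening bracket.
        start = match.start() + offset
        end = match.end() + offset
        result = result[:start] + fill + result[end:]
        offset += len(fill) - len(MASK)
    return result

print(fill_masks("The [MASK] sat on the [MASK].", ["cat", "mat"]))
```

Because every replacement changes the length of the string, positions found in the original text drift unless the offset is updated after each substitution.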

After these corrections, the function seems to work as I understand its intent.

NCwork commented 1 year ago

Yes, you are right. Thanks for bringing it up!

BurhanUlTayyab commented 1 year ago

Fixed