jacob-g / wikimonitor

The source code for the Scratch Wiki bot, WikiMonitor
Apache License 2.0
9 stars, 5 forks

Improve unsigned post detection #5

Closed (jacob-g closed this 6 years ago)

jacob-g commented 6 years ago

The post added in https://en.scratch-wiki.info/w/index.php?diff=192173 should be detected as being at the beginning of the page, but because there are two blank lines before the first header, that header isn't detected as coming after the post.

https://en.scratch-wiki.info/w/index.php?diff=192138 is another problem case: the edit does not add a full line.

Additionally, there should be a way to test this on arbitrary edits: convert the check into a function of the page contents and the diff, rather than leaving it as just a section of the loop.
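A rough sketch of what such a standalone function could look like, written in Python purely for illustration (it is not WikiMonitor's code). The name `is_unsigned_post`, the timestamp pattern used as the "signed" test, and the rule that a post must be followed by a header or the end of the page are all assumptions. Blank lines between the post and the next header are skipped, which covers the two-blank-lines case above.

```python
import re

# Assumption: a signed post contains a MediaWiki-style timestamp such as
# "12:34, 5 January 2018 (UTC)". The real bot may use a different rule.
TIMESTAMP_RE = re.compile(r"\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)")
# A wikitext section header such as "== Topic ==" or "=== Subtopic ===".
HEADER_RE = re.compile(r"^=+[^=].*=+\s*$")


def next_nonblank_line(lines, start):
    """Return the first non-blank line at or after index `start`, or None."""
    for line in lines[start:]:
        if line.strip():
            return line
    return None


def is_unsigned_post(page_text, added_text):
    """Hypothetical check: is `added_text` a whole-line, unsigned post?

    `page_text` is the page after the edit; `added_text` is what the edit added.
    """
    page_lines = page_text.split("\n")
    added_lines = added_text.strip("\n").split("\n")
    n = len(added_lines)

    # Locate the addition as a run of complete lines in the page.
    for i in range(len(page_lines) - n + 1):
        if page_lines[i:i + n] == added_lines:
            break
    else:
        return False  # the edit did not add full lines

    if TIMESTAMP_RE.search(added_text):
        return False  # looks signed

    # Skip any number of blank lines before looking for the next header.
    following = next_nonblank_line(page_lines, i + n)
    return following is None or HEADER_RE.match(following) is not None
```

Because the function depends only on its two string arguments, it can be run against any saved revision and diff without starting the bot's loop.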

caker18Productions commented 6 years ago

Second

caker18Productions commented 6 years ago

I'm S-zhangcha, just in case you were wondering

jacob-g commented 6 years ago

Another example: https://en.scratch-wiki.info/w/index.php?diff=195423

jacob-g commented 6 years ago

Yet another: https://en.scratch-wiki.info/w/index.php?diff=195530

jacob-g commented 6 years ago

Per kenny2scratch's suggestion, this can be improved by using the unified diff instead of the HTML diff. See http://php.net/manual/en/function.xdiff-string-diff.php
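To illustrate why a unified diff is easier to work with, here is a small Python sketch that uses the standard library's `difflib.unified_diff` as a stand-in for PHP's `xdiff_string_diff`. The helper name `added_lines` is hypothetical, and this is not the bot's actual implementation. In a unified diff every added line starts with `+`, so the added text can be read off directly instead of being parsed out of HTML.

```python
import difflib
from itertools import islice


def added_lines(old_text, new_text):
    """Return the lines that the edit added, taken from a unified diff."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        lineterm="",
        n=0,  # no context lines; only the changes themselves
    )
    # The first two lines are the "---" and "+++" file headers; skip them.
    body = islice(diff, 2, None)
    return [line[1:] for line in body if line.startswith("+")]


old = "== Topic ==\nFirst post 12:00, 1 January 2018 (UTC)\n"
new = "New comment\n\n\n" + old
print(added_lines(old, new))
```

One thing to note: an edit that appends text to an existing line shows up as that whole line being removed and re-added, so partial-line additions still need to be compared against the removed lines.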

jacob-g commented 6 years ago

The new codebase passes on all of these edits. Hopefully it's good enough now.