NovemLinguae closed this 11 months ago
Note to self. Algorithm idea to fix this last case ("someone puts a category in the wrong place (not at the bottom of the article)"). Will code this up later and add to patch (and will add some more test cases):
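// grab everything from the first category to the end of the page; the s flag lets . match newlines
const match = wikitext.match(/\[\[:?Category:.*$/s);
// match() returns an array or null, so pull out the matched string (empty if there are no categories)
let textBetweenFirstCategoryAndEndOfFile = match ? match[0] : '';
// delete categories from sampled text
textBetweenFirstCategoryAndEndOfFile = textBetweenFirstCategoryAndEndOfFile.replace(/\[\[:?Category:[^\]]+\]\]/g, '');
// does the non-category sample text have anything except whitespace?
const hasNonWhitespace = /\S/.test(textBetweenFirstCategoryAndEndOfFile);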
if ( hasNonWhitespace ) {
return;
}
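For adding test cases, the same check can be pulled out into a standalone function. This is only a rough sketch: the name `categoriesAreAtBottom` is hypothetical and not part of the patch, and it assumes that a page with no categories should count as fine.

```javascript
// Hypothetical helper, not from the patch: returns true when nothing but
// whitespace and other categories follows the first category on the page.
function categoriesAreAtBottom( wikitext ) {
	// s flag so that . also matches newlines and the match runs to end of input
	const match = wikitext.match( /\[\[:?Category:.*$/s );
	if ( !match ) {
		// assumption: a page with no categories has no misplaced category
		return true;
	}
	// strip the categories themselves, then see if anything else is left
	const remainder = match[ 0 ].replace( /\[\[:?Category:[^\]]+\]\]/g, '' );
	return !/\S/.test( remainder );
}

console.log( categoriesAreAtBottom( 'Text\n\n[[Category:A]]\n[[Category:B]]\n' ) ); // true
console.log( categoriesAreAtBottom( '[[Category:A]]\n== Heading ==\nText' ) ); // false
```

The caller would bail out (as in the `return` above) whenever this returns false.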
Note to self: maybe I can fix this by tweaking the existing regex. Look into some of the ideas in this article, in the "Possessive Quantifiers and Atomic Grouping to The Rescue" section:
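One caveat with this idea: JavaScript regular expressions have no native possessive quantifiers or atomic groups, so the usual workaround is a lookahead combined with a backreference. A minimal sketch of that trick, using made-up patterns rather than the actual regex from this patch:

```javascript
// Illustrative patterns only; not the regex from this patch.

// Ordinary group: once "bc" leaves no "c" for the tail, the engine
// backtracks into the group and retries with the "b" alternative.
const backtracking = /a(?:bc|b)c/;

// Emulated atomic group: the lookahead captures its match once and \1
// consumes exactly that text. The engine never backtracks into a lookahead,
// so the "b" alternative is not retried.
const atomic = /a(?=(bc|b))\1c/;

console.log( backtracking.test( 'abc' ) ); // true
console.log( atomic.test( 'abc' ) ); // false
console.log( atomic.test( 'abcc' ) ); // true
```

Whether this is enough to remove the catastrophic backtracking depends on where the existing regex does its repeated retries, so it would need to be tried against the failing test case.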
OK, I rewrote this and solved all the issues. Ready for review. The new algorithm should behave identically to the old one, but it is iterative instead of regex-based, so it has no catastrophic backtracking problems.
Fix #245
This new code is vulnerable to deleting the wrong heading if someone puts a category in the wrong place (not at the bottom of the article), but I think that's an acceptable tradeoff for now. If it actually affects someone we can make the solution more complex in a future patch.