addshore / wikicrowd

Tool for crowd sourced micro edits for Wikimedia
https://wikicrowd.toolforge.org/
MIT License
7 stars 4 forks

camera: exclude "photos taken with" categories #56

Closed. waldyrious closed this 2 years ago

waldyrious commented 2 years ago

Not sure if category exclusions are recursive. If they're not, this PR is pretty much useless, since almost all of the contents of Category:Photographs by camera manufacturer are in its subcategories.

Assuming this works, it would fix some instances of "depicts a camera" questions, like File:View from Mount Seymour Trail.jpg, which is in Category:Taken with Olympus Air A01.

addshore commented 2 years ago

So the code looks down the tree from the top level. If while navigating that tree Category:Photographs by camera manufacturer is encountered, the code will not continue to explore down that tree!

So, will this code have the desired effect? Another alternative would be ignoring Taken with in the regex!
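
To illustrate the pruning described above, here is a minimal Python sketch. It is not the tool's actual code: the `children` mapping, the function name `collect_categories`, and the toy category names other than Category:Photographs by camera manufacturer are assumptions made up for the example.

```python
from collections import deque


def collect_categories(children, root, excluded):
    """Walk down from `root`, never descending into an excluded category."""
    seen = set()
    queue = deque([root])
    while queue:
        category = queue.popleft()
        if category in seen or category in excluded:
            # An excluded category is skipped, so its subcategories are
            # never queued from here.
            continue
        seen.add(category)
        queue.extend(children.get(category, []))
    return seen


# Hypothetical toy tree, only for illustration.
children = {
    "Category:Top": [
        "Category:A",
        "Category:Photographs by camera manufacturer",
    ],
    "Category:Photographs by camera manufacturer": ["Category:Taken with X"],
    "Category:A": [],
}

found = collect_categories(
    children,
    "Category:Top",
    excluded={"Category:Photographs by camera manufacturer"},
)
print(sorted(found))
```

In this toy tree the excluded category and its only subcategory are never visited, because nothing else links to that subcategory.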

waldyrious commented 2 years ago

> So, will this code have the desired effect?

I'm not sure. I mean, there is clearly a path from the top-level to Category:Olympus Air A01 > Category:Taken with Olympus Air A01 > File:View from Mount Seymour Trail.jpg without going through Category:Photographs by camera manufacturer, so naively, I would suspect the restriction included here could be bypassed.

The Taken with regex might be more reliable. Perhaps we could use both approaches?
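
A rough Python sketch of the bypass concern, under stated assumptions: the graph below is a toy built from the names mentioned in this thread (the edges and the "Category:Top level" root are guesses, not the real Commons structure), and `reachable` and the two skip functions are hypothetical helpers, not the tool's code.

```python
import re


def reachable(children, start, target, should_skip):
    """Return True if `target` can be reached from `start` without
    passing through any node for which `should_skip` is true."""
    stack = [start]
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen or should_skip(node):
            continue
        if node == target:
            return True
        seen.add(node)
        stack.extend(children.get(node, []))
    return False


# Toy graph: the "Taken with" category has two parents (assumed edges).
children = {
    "Category:Top level": [
        "Category:Olympus Air A01",
        "Category:Photographs by camera manufacturer",
    ],
    "Category:Photographs by camera manufacturer": [
        "Category:Taken with Olympus Air A01",
    ],
    "Category:Olympus Air A01": ["Category:Taken with Olympus Air A01"],
    "Category:Taken with Olympus Air A01": [
        "File:View from Mount Seymour Trail.jpg",
    ],
}

EXCLUDED = {"Category:Photographs by camera manufacturer"}
TAKEN_WITH = re.compile(r"taken with", re.IGNORECASE)


def skip_exact(name):
    # Exact-name exclusion only.
    return name in EXCLUDED


def skip_both(name):
    # Exact-name exclusion plus the "Taken with" regex.
    return name in EXCLUDED or TAKEN_WITH.search(name) is not None


target = "File:View from Mount Seymour Trail.jpg"
exact_only = reachable(children, "Category:Top level", target, skip_exact)
with_regex = reachable(children, "Category:Top level", target, skip_both)
print(exact_only, with_regex)
```

In this toy graph the exact exclusion alone still lets the file through via Category:Olympus Air A01, while adding the regex blocks the remaining path.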

PiotrGackowski commented 2 years ago

@addshore a short description of how to block categories, and of the difference between excluded categories and the regex, would be nice. Right now I don't know the difference between them.

addshore commented 2 years ago

> @addshore a short description of how to block categories, and of the difference between excluded categories and the regex, would be nice. Right now I don't know the difference between them.

An excluded category is only skipped when the entry is an exact match for the whole name of the category.

So Category:Foo will only result in Category:Foo exactly being skipped.

However, if we tweak the excludeRegex to include foo, then any category whose name contains foo will be skipped.

You can test the regex easily with online tools such as https://www.regextester.com/

I can provide more links for reading on regex too :) TL;DR: /(foo|bar|baz)/i matches any category whose name contains foo, bar, or baz (the i flag makes the match case-insensitive).
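
For anyone who prefers code to an online tester, here is a small Python sketch of the difference between the two mechanisms. It assumes the regex is applied as an unanchored search against the full category name; the names `EXCLUDED`, `EXCLUDE_REGEX`, and the sample categories are made up for illustration and are not taken from the tool.

```python
import re

# Exact-match exclusions: only these whole names are skipped.
EXCLUDED = {"Category:Foo"}

# Python equivalent of /(foo|bar|baz)/i.
EXCLUDE_REGEX = re.compile(r"(foo|bar|baz)", re.IGNORECASE)


def skipped_by_exact(name):
    return name in EXCLUDED


def skipped_by_regex(name):
    # search() matches anywhere in the name, not just whole words.
    return EXCLUDE_REGEX.search(name) is not None


names = [
    "Category:Foo",
    "Category:Foo fighters",
    "Category:Old barns",
    "Category:Cameras",
]

for name in names:
    print(name, skipped_by_exact(name), skipped_by_regex(name))
```

Note that the regex matches substrings, so "Category:Old barns" is skipped because it contains "bar". If that is too broad, a word boundary such as `\bbar\b` would narrow it.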