addshore / wikicrowd

Tool for crowd sourced micro edits for Wikimedia
https://wikicrowd.toolforge.org/
MIT License

Remove duck from games #71

Open PiotrGackowski opened 2 years ago

PiotrGackowski commented 2 years ago

I was informed that I marked as "duck" some images that were already marked with a specific species.

https://commons.wikimedia.org/wiki/File:Family_of_Australian_wood_ducks_(Chenonetta_jubata)_at_Queens_Gardens,_Perth,_November_2021_01.jpg

I checked this, and in my opinion we should remove "duck" as a game. I understand why we started with ducks, dogs and cats, but I don't think we will ever be able to exclude species correctly. There is also the number of photos: we have quite a lot of unidentified cats that we can mark, but ducks are much more "specific" and many of them will already have a species. That was also the reason I didn't add horses and other entries from https://commons.wikimedia.org/wiki/Category:Mammals_by_common_named_groups

PiotrGackowski commented 2 years ago

https://github.com/addshore/wikicrowd/pull/72/files

addshore commented 1 year ago

I'd love to dive into this one a bit more rather than just remove duck. It'll just be a modeling issue on Wikidata, and once that is fixed, images that are already marked as ducks would not appear. Not sure when I'll have time to look though.

andyli commented 1 year ago

Similar issue here. My file already had "depicts vegan burger", but someone added "hamburger" to it via wikicrowd.

I would suggest that wikicrowd ignore files with existing structured data.

PiotrGackowski commented 1 year ago

@andyli - ducks are a different issue; I created a separate ticket for hamburger: https://github.com/addshore/wikicrowd/issues/83

andyli commented 1 year ago

My concern is that a similar problem exists for every wikicrowd game, not only "hamburger" and "duck".

Wikicrowd's "depicts" suggestion is very likely to overlap with existing structured data on a file. That's why I suggested that wikicrowd ignore files with existing structured data.

Ideally, wikicrowd users should be presented with existing structured data, and asked whether wikicrowd's suggestion would be a good replacement or addition. That's more complex to design and implement though.
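A minimal sketch of the simpler option (skipping files that already carry structured data), in Python for illustration only. The input shape (file title mapped to a dict of property ID to item IDs), the function names, and the placeholder item IDs are assumptions made for this example; they are not wikicrowd's actual code or data model. P180 is the "depicts" property and P170 is "creator".

```python
def has_existing_data(statements, properties=None):
    """Return True if a file already has structured data.

    statements: dict mapping property ID -> list of item IDs (assumed shape).
    properties: optional list of property IDs to check; None means any property.
    """
    if properties is None:
        return any(statements.values())
    return any(statements.get(prop) for prop in properties)


def eligible_files(candidates, properties=None):
    """Keep only the files a game could still offer to players."""
    return [
        title
        for title, statements in candidates.items()
        if not has_existing_data(statements, properties)
    ]


# Hypothetical candidates; the item IDs are placeholders, not real Wikidata IDs.
candidates = {
    "Plain.jpg": {},
    "Burger.jpg": {"P180": ["Q_VEGAN_BURGER"]},
    "Painting.jpg": {"P170": ["Q_SOME_ARTIST"]},
}

strict = eligible_files(candidates)                 # skip files with any statement
depicts_only = eligible_files(candidates, ["P180"])  # skip only files with "depicts"
```

The strict variant matches the suggestion as written; the `["P180"]` variant is a narrower reading that still lets through files whose only structured data is unrelated to "depicts".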

addshore commented 1 year ago

> Wikicrowd's "depicts" suggestion is very likely to overlap with existing structured data on a file. That's why I suggested that wikicrowd ignore files with existing structured data.

The issue is slightly complex, so I hope we can work this out automatically from the ontologies on Wikidata, so that wikicrowd both stops adding these statements if the community doesn't think they are warranted and can also go back and fix its previous mistakes.

Generally, if the ontology is correct on Wikidata, then wikicrowd will not add a statement for a higher-level depicts item. Though it looks like that doesn't happen for the relationship between Australian wood duck and duck.

These two items appear to exist in separate parts of the graph, with very little semantically linking them. If you head up the wood duck tree you can see the taxon that relates to Q3736439 duck. But the taxon and the common name don't share much of a relation.

Potentially only via a qualifier on an "instance of" statement (shown in a screenshot in the original comment).

It looks like there are many more examples of this sort of pattern. You can find a bunch that link to organisms known by a particular common name (Q55983715): https://www.wikidata.org/wiki/Special:WhatLinksHere/Q55983715
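The check described above (not adding a higher-level depicts item when a more specific one is already present) could be sketched roughly as follows. This is an illustration under stated assumptions, not wikicrowd's implementation: the toy `PARENTS` graph stands in for parent links on Wikidata such as "subclass of" (P279) or "parent taxon" (P171), and every ID except Q3736439 (duck) is a made-up placeholder.

```python
from collections import deque

# Hypothetical toy graph: item -> set of direct parent items.
# Only Q3736439 (duck) comes from the thread; the rest are placeholders.
PARENTS = {
    "Q_LINKED_SPECIES": {"Q3736439"},      # a species modeled under "duck"
    "Q_WOOD_DUCK": {"Q_WOOD_DUCK_GENUS"},  # taxon tree that never reaches "duck"
    "Q_WOOD_DUCK_GENUS": {"Q_DUCK_FAMILY"},
}


def ancestors(item, parents):
    """Collect every item reachable by following parent links upward."""
    seen = set()
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def is_redundant(suggested, existing_depicts, parents):
    """True if an existing depicts item equals or descends from the suggestion."""
    return any(
        item == suggested or suggested in ancestors(item, parents)
        for item in existing_depicts
    )


linked = is_redundant("Q3736439", {"Q_LINKED_SPECIES"}, PARENTS)
unlinked = is_redundant("Q3736439", {"Q_WOOD_DUCK"}, PARENTS)
```

In this toy graph the suggestion is suppressed for the linked species but not for the wood duck placeholder, which mirrors the gap described above: when the taxon tree and the common-name item are not connected, the walk never reaches "duck" and the file is still offered.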