fuddl / wd

a browser extension for wikidata
https://wikidata.org/wiki/Wikidata:Tools/Wikidata_for_Firefox
GNU General Public License v3.0

P854 should not be added when source is Wikipedia, but P143/P4656 #118

Closed JeanFred closed 1 year ago

JeanFred commented 1 year ago

I noticed this edit: https://www.wikidata.org/w/index.php?title=Special%3AEntityPage%2FQ753749&curid=709129&diff=1765889071&oldid=1765889067, which added the following reference URL: https://en.wikipedia.org/wiki/Category:Post-apocalyptic_video_games?oldid=1073535554#Pages_in_category

I believe P854 should not be used in such cases, but P143 together with P4656.

fuddl commented 1 year ago

P4656 should be easy. I only need to replace the property when url matches .+(mediawiki|wik(i(books|data|(m|p)edia|news|quote|source|species|versity|voyage)|tionary)|wmflabs)\.org.+.
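A minimal sketch of the property switch described above, using the regular expression from this comment verbatim. The helper name `referenceUrlProperty` is hypothetical and not taken from the extension's source; P854 is reference URL and P4656 is Wikimedia import URL.

```typescript
// Regular expression quoted from the comment above; it matches URLs on Wikimedia domains.
const WIKIMEDIA_URL =
  /.+(mediawiki|wik(i(books|data|(m|p)edia|news|quote|source|species|versity|voyage)|tionary)|wmflabs)\.org.+/;

// Hypothetical helper: choose which property should carry the source URL.
function referenceUrlProperty(url: string): "P4656" | "P854" {
  return WIKIMEDIA_URL.test(url) ? "P4656" : "P854";
}
```

Note that the pattern requires at least one character after `.org`, so a bare `https://www.wikidata.org` without a trailing slash or path would fall back to P854. It is also unanchored, so any `.org` domain whose name merely ends in one of the project names would match as well.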

Mapping the projects for P143 should be the hard part. Are both really necessary? Isn't it redundant?

JeanFred commented 1 year ago

Well, I guess P143 existed 'long' before P4656; I can imagine there are some applications, tools, etc. that rely on P143 (like "give me all statements sourced from French Wikipedia" or something), but I don’t know of any myself, to be honest.

I would say, ideally both would be done, but it does not need to be done by this extension either: I agree the mapping is hard to make, maintain and ship as part of this extension.

So I would say: please switch to P4656, and that’s it. If P143 really is a must-have, someone can implement a separate bot process to populate it based on P4656.

derenrich commented 1 year ago

Why not just support the most common values for P143 and skip it if the project is unknown? That's what wwwyzzerdd does: https://github.com/derenrich/wwwyzzerdd/blob/main/src/write.ts#L66

fuddl commented 1 year ago

@derenrich @JeanFred How about this: I will match the current URL against this list of domains, and if it matches, the reference will receive Wikimedia import URL (P4656) instead of reference URL (P854). The value of imported from Wikimedia project (P143) will be the corresponding item (?project).

This should automatically update itself. What do you think?
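The proposal could be sketched roughly as follows. This is an illustration under assumptions, not the extension's actual code: `ProjectRow`, `Reference`, `buildReference` and `sampleProjects` are hypothetical names, the reference is modelled as a simplified property-to-value record rather than the Wikibase JSON format, and the two sample rows stand in for the real domain list.

```typescript
// Row shape assumed for the domain list: a hostname and the project's item ID.
interface ProjectRow {
  hostname: string;
  item: string; // e.g. "Q328" for English Wikipedia
}

// Simplified reference: property ID -> value (not the real Wikibase JSON format).
type Reference = Record<string, string>;

// Hypothetical helper: build the reference for a source URL.
function buildReference(sourceUrl: string, projects: ProjectRow[]): Reference {
  // Index the list by hostname for direct lookup.
  const byHostname: Record<string, string> = {};
  for (const row of projects) {
    byHostname[row.hostname] = row.item;
  }

  // Extract the hostname: everything between the protocol and the first "/", "?" or "#".
  const match = /^[a-z]+:\/\/([^\/?#]+)/i.exec(sourceUrl);
  const hostname = match ? match[1].toLowerCase() : "";

  const known = Object.prototype.hasOwnProperty.call(byHostname, hostname);
  if (!known) {
    // Not a listed Wikimedia project: keep the ordinary reference URL.
    return { P854: sourceUrl };
  }
  // Listed project: Wikimedia import URL plus imported from Wikimedia project.
  return { P4656: sourceUrl, P143: byHostname[hostname] };
}

// Illustrative rows only; the real list would come from the query results.
const sampleProjects: ProjectRow[] = [
  { hostname: "en.wikipedia.org", item: "Q328" },
  { hostname: "fr.wikipedia.org", item: "Q8447" },
];
```

Because the lookup table is data rather than code, refreshing the list from Wikidata is enough to pick up new projects, which is the self-updating behaviour described above.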

JeanFred commented 1 year ago

That sounds perfect, yes

derenrich commented 1 year ago

Seems good

fuddl commented 1 year ago

Okay, I will exclude projects that are merely hosted on Wikimedia websites, since I only want to match hostnames:

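SELECT ?hostname ?item WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  # Every instance of (a subclass of) Wikimedia project (Q14827288)...
  ?project wdt:P31/wdt:P279* wd:Q14827288.
  # ...that has an official website (P856).
  ?project wdt:P856 ?url.
  # Reduce the entity URI to the bare item ID.
  BIND(REPLACE(STR(?project), "http://www.wikidata.org/entity/", "") AS ?item).
  # Strip the protocol from the URL.
  BIND(REPLACE(STR(?url), "^[a-z]+://", "") AS ?sans_protocol).
  # Skip URLs with a path, i.e. projects merely hosted on another site.
  FILTER(!REGEX(?sans_protocol, "/.+$", "i")).
  # Drop the trailing slash, leaving only the hostname.
  BIND(REPLACE(?sans_protocol, "/$", "") AS ?hostname).
}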

Try this query

If you see a problem with this let me know.
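To make the filtering easier to check against sample URLs, here is a sketch that mirrors the three string operations of the query in TypeScript. The function name `hostnameFromOfficialWebsite` is hypothetical; the regular expressions are copied from the query's `REPLACE` and `REGEX` calls.

```typescript
// Hypothetical helper mirroring the query: returns the hostname,
// or null when the official website has a path and would be filtered out.
function hostnameFromOfficialWebsite(url: string): string | null {
  // BIND(REPLACE(STR(?url), "^[a-z]+://", "")): drop the protocol.
  const sansProtocol = url.replace(/^[a-z]+:\/\//, "");
  // FILTER(!REGEX(?sans_protocol, "/.+$", "i")): reject URLs with a path after the host.
  if (/\/.+$/i.test(sansProtocol)) {
    return null;
  }
  // BIND(REPLACE(?sans_protocol, "/$", "")): drop a trailing slash.
  return sansProtocol.replace(/\/$/, "");
}
```

For example, `https://en.wikipedia.org/` yields `en.wikipedia.org`, while a project whose official website is a page on another wiki, such as `https://commons.wikimedia.org/wiki/Commons:Structured_data`, yields `null` and is excluded.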

fuddl commented 1 year ago

The latest version (.304) will make source statements like these.

If you notice issues, please re-open this ticket.