tomayac / wikipedia-tools-for-google-spreadsheets

Wikipedia Tools for Google Spreadsheets — Install:
https://gsuite.google.com/marketplace/app/wikipedia_and_wikidata_tools/595109124715?pann=cwsdp&hl=en
Apache License 2.0

Add lookup for external IDs #13

Closed: pigsonthewing closed this issue 3 years ago

pigsonthewing commented 7 years ago

Given a sheet with a column containing values for a Wikidata property of the "external ID" type, it would be helpful, please, to have a tool that returns the corresponding Wikidata QID.

The tool should take as part of its input a PID for the property (e.g. P496 for ORCID iDs).

If more than one match is found, a warning (error message) should be returned.

Optional enhancement: If the value is a URI, the ID should be obtained by reference to the property's formatter URL (P1630). For example, for the URI https://orcid.org/0000-0002-1003-5675 the value is 0000-0002-1003-5675, obtained by removing the formatter URL ("https://orcid.org/$1") after first discarding the latter's "$1" placeholder. Note that some formatter URLs have "$1" in the middle of the string, rather than at the end. Trailing slashes, optional "www." and alternative protocols (http:// vs. https://) should also be handled.
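The optional enhancement above could be sketched roughly as follows. This is a minimal illustration in Python, not code from the add-on (which runs as Google Apps Script). The function name `extract_external_id` is hypothetical, and the formatter URL is assumed to be passed in rather than fetched from P1630.

```python
import re


def extract_external_id(value, formatter_url):
    """Return the ID embedded in `value` if it matches `formatter_url`.

    Values that do not match the formatter URL (e.g. bare IDs) are
    returned unchanged apart from surrounding whitespace.
    """
    def normalize(url):
        # Drop the protocol (http:// or https://) and an optional "www." prefix.
        return re.sub(r"^(?:(?:https?:)?//)?(?:www\.)?", "", url.strip(),
                      flags=re.IGNORECASE)

    # Split the formatter URL around the "$1" placeholder, which may sit
    # at the end or in the middle of the string.
    prefix, placeholder, suffix = normalize(formatter_url).partition("$1")
    if not placeholder:
        return value.strip()

    # Build a pattern that captures whatever stands in for "$1" and
    # tolerates an optional trailing slash.
    pattern = (re.escape(prefix) + r"(.+?)"
               + re.escape(suffix.rstrip("/")) + r"/?")
    match = re.fullmatch(pattern, normalize(value))
    return match.group(1) if match else value.strip()
```

For the ORCID example, `extract_external_id("https://orcid.org/0000-0002-1003-5675", "https://orcid.org/$1")` yields `0000-0002-1003-5675`. A bare ID passed as `value` comes back as-is, so the same column could mix URIs and plain identifiers.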

tomayac commented 7 years ago

So just to make sure I get this right: you have a column with ORCiD identifiers like below:

Identifier
0000-0000-1234-5678
0000-0000-5678-1234

You then want a function like WIKIDATAIDENTIFIERLOOKUP() that would find out (or be passed the information) that 0000-0000-1234-5678 (or the URL https://orcid.org/0000-0002-1003-5675) is of type ORCiD, and then return the corresponding Wikidata QID, right? Concretely speaking, for 0000-0002-1003-5675 it would be Q8134165 (Mike Taylor).

pigsonthewing commented 7 years ago

Yes, that's exactly it, thank you.

Of course the IDs may be ORCID, or VIAF, or IMDb, or any other with the external-ID datatype (dynamic list at https://www.wikidata.org/wiki/Special:ListProperties/external-id )

Maybe WIKIDATAIDLOOKUP() for short ;-)

simon04 commented 3 years ago

The relevant API is haswbstatement, see https://www.mediawiki.org/wiki/Help:Extension:WikibaseCirrusSearch#haswbstatement

Example: https://www.wikidata.org/w/api.php?action=query&format=json&list=search&srsearch=haswbstatement:P496=0000-0002-1003-5675

```json
{"batchcomplete":"","query":{"searchinfo":{"totalhits":1},"search":[{"ns":0,"title":"Q8134165","pageid":8085821,"size":16343,"wordcount":159,"snippet":"Mike Taylor\nMike Taylor\n\u041c\u0430\u0458\u043a \u0422\u0435\u0458\u043b\u043e\u0440\nMike Taylor\nMike Taylor\nMike Taylor\nMike Taylor\nMike Taylor\nMike Taylor\nMike Taylor\nMike Taylor\nMike Taylor\nMike Taylor","timestamp":"2020-11-24T06:08:20Z"}]}}
```

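A lookup built on this search could look roughly like the sketch below. It is written in Python purely for illustration (the add-on itself is Google Apps Script), and every function name here is hypothetical. It assumes the `title` of a search hit is the item's QID, as in the response above, and it raises an error when more than one item matches, as requested earlier in this issue.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://www.wikidata.org/w/api.php"


def build_lookup_url(pid, external_id):
    """Build the search URL for items whose property `pid` equals `external_id`."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": f"haswbstatement:{pid}={external_id}",
    }
    return f"{API_URL}?{urlencode(params)}"


def qid_from_response(response):
    """Return the single matching QID, None if there is no match.

    Raises ValueError if more than one item matches.
    """
    hits = response.get("query", {}).get("search", [])
    if not hits:
        return None
    if len(hits) > 1:
        titles = ", ".join(hit["title"] for hit in hits)
        raise ValueError(f"Multiple matches: {titles}")
    # Assumption: for item results, the page title is the QID.
    return hits[0]["title"]


def wikidata_id_lookup(pid, external_id):
    """Query the API over the network and return the matching QID."""
    request = Request(
        build_lookup_url(pid, external_id),
        # Placeholder User-Agent for this sketch.
        headers={"User-Agent": "wikidata-id-lookup-sketch/0.1"},
    )
    with urlopen(request) as reply:
        return qid_from_response(json.load(reply))
```

Keeping the URL building and response parsing separate from the network call means the parsing can be checked against a saved response such as the one above without hitting the API.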
tomayac commented 3 years ago

Unfortunately, I won't have time to build this myself, but if anyone here knows how to add it, I'm happy to review and merge a Pull Request that adds the feature.