Open Daniel-Mietchen opened 6 years ago
There is now a Listeria list up at https://www.wikidata.org/wiki/Wikidata:WikiProject_Source_MetaData/Wikidata_lists/Author_name_strings_matched_to_author_items_using_Stated_As .
I tried to adapt that to an institutional context but did not get it to work properly, so here's what I have so far:
SELECT
# Author of a paper with a "stated as" statement for authorship
?item
# Sample work by the author with that "stated as" value
?pub
# Build URL to the Author disambiguator tool
(CONCAT(
'[https://tools.wmflabs.org/author-disambiguator/?doit=Look+for+author&name=',
ENCODE_FOR_URI(?authorstring), ' ',?authorstring , ']') AS ?string_resolver)
# Number of works with an author name string that matches the one above
?count
WITH {
SELECT
(COUNT(DISTINCT ?work) AS ?count)
?authorstring
WHERE {
?work wdt:P2093 ?authorstring .
?work wdt:P50 ?author .
{ ?author wdt:P108 / wdt:P361* wd:Q213439 .}
UNION
{ ?author wdt:P463 / wdt:P361* wd:Q213439 .}
UNION
{ ?author wdt:P1416 / wdt:P361* wd:Q213439 .}
FILTER(!regex (?authorstring, "^[A-Za-z]{1}.\\s")).
}
GROUP BY ?authorstring
} AS %result
WITH {
SELECT DISTINCT ?authorstring ?item #(SAMPLE(?work1) AS ?pub)
?count
WHERE {
INCLUDE %result
?work1 p:P50 ?author_statement .
?author_statement ps:P50 ?item .
?author_statement pq:P1932 ?authorstring .
}
GROUP BY ?authorstring ?item #?pub
?count
} AS %stateds
WHERE {
INCLUDE %stateds
}
ORDER BY DESC(?count)
LIMIT 200
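For readers who want to see what the ?string_resolver column ends up containing, here is a minimal Python sketch of the same string building as the CONCAT in the query. The helper name `resolver_link` is made up for this illustration, and `urllib.parse.quote` with `safe=""` is only assumed to be a close stand-in for SPARQL's ENCODE_FOR_URI.

```python
from urllib.parse import quote

# Base URL copied from the query above.
BASE = "https://tools.wmflabs.org/author-disambiguator/?doit=Look+for+author&name="


def resolver_link(author_string: str) -> str:
    """Hypothetical helper: build the wiki-style link '[URL label]' used in the query."""
    # safe="" percent-encodes everything except unreserved characters,
    # which is assumed here to approximate ENCODE_FOR_URI.
    return "[" + BASE + quote(author_string, safe="") + " " + author_string + "]"


print(resolver_link("Smith J"))
```

The square brackets and the space before the label follow wikitext external-link syntax, which is presumably why the query emits them for display on a Listeria page.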
Probably needs some more fine-tuning with regexes, so I played a bit with https://regex101.com/ .
A simple change of the regex to
FILTER(regex (?authorstring, "^(?=^[A-Z][a-z]{1,}.*)(?=.*[a-z]$).*$")).
seems to make this query useful.
Will set up a Listeria page for the query now.
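To make the difference between the two filters easier to see, here is a small Python sketch that applies both patterns to a few name strings. It assumes that Python's `re` module treats these particular patterns the same way as the regex engine behind the query service; the helper names `kept_by_old` and `kept_by_new` are invented for this example.

```python
import re

# First attempt (negated in the query): a single letter, then any one
# character, then whitespace at the start of the string, e.g. "D. Mietchen".
OLD = re.compile(r"^[A-Za-z]{1}.\s")

# Revised filter: the string must start with an uppercase letter followed by
# at least one lowercase letter, and must end with a lowercase letter.
NEW = re.compile(r"^(?=^[A-Z][a-z]{1,}.*)(?=.*[a-z]$).*$")


def kept_by_old(name: str) -> bool:
    """True if FILTER(!regex(...)) with the first pattern would keep the string."""
    return OLD.search(name) is None


def kept_by_new(name: str) -> bool:
    """True if FILTER(regex(...)) with the revised pattern would keep the string."""
    return NEW.search(name) is not None


samples = ["Daniel Mietchen", "D. Mietchen", "Smith J. W. X.", "Smith J W X"]
for name in samples:
    print(f"{name!r}: old={kept_by_old(name)} new={kept_by_new(name)}")
```

With these samples, only "Daniel Mietchen" passes the revised filter. The first pattern rejects only "D. Mietchen", so it still lets through strings that end in initials, such as "Smith J. W. X.".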
That is, if an item has a P2093 (author name string) value of "Smith J. W. X.", or perhaps even "Smith J W X" or similar, the tool would look for paper items that have a P50 (author) statement with a P1932 (stated as) qualifier of "Smith J. W. X." and suggest those authors as potential candidates for switching the P2093 statements to P50 ones.
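The matching idea described above could be sketched roughly as follows. This is not how the Author Disambiguator tool is implemented. It is an in-memory illustration that assumes a simple normalization (dropping periods, collapsing whitespace) is enough to treat "Smith J. W. X." and "Smith J W X" as the same string. The function names and the AUTHOR_A/B/C identifiers are placeholders, not real Wikidata items.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Drop periods and collapse whitespace: 'Smith J. W. X.' -> 'Smith J W X'."""
    return " ".join(name.replace(".", " ").split())


def build_index(stated_as_pairs):
    """Map each normalized 'stated as' (P1932) string to the author items using it."""
    index = defaultdict(set)
    for author_item, stated_as in stated_as_pairs:
        index[normalize(stated_as)].add(author_item)
    return index


def suggest_candidates(name_string: str, index) -> list:
    """Return author items whose 'stated as' value matches a P2093 name string."""
    return sorted(index.get(normalize(name_string), set()))


# (author item, stated-as qualifier) pairs; identifiers are placeholders.
pairs = [
    ("AUTHOR_A", "Smith J. W. X."),
    ("AUTHOR_B", "Smith J."),
    ("AUTHOR_C", "Smith J W X"),
]
index = build_index(pairs)
print(suggest_candidates("Smith J W X", index))  # ['AUTHOR_A', 'AUTHOR_C']
```

Returning every matching item rather than a single best guess keeps the final decision with a human reviewer, in line with the "potential candidates" wording.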