WDscholia / scholia

Wikidata-based scholarly profiles
https://scholia.toolforge.org

On curation page for work aspect, add panel on taxa tagged as main subject to find other works that have their taxon names in the title but no corresponding main subject tags #1901

Open Daniel-Mietchen opened 2 years ago

Daniel-Mietchen commented 2 years ago

What kind of panel would you like to add to which Scholia aspect?

Something that makes use of existing topic tagging for taxa to facilitate more of it.

What kind of information should the panel provide, and which of the visualization options (e.g. table, bubble chart, map) should it use?

Here is a draft query:

PREFIX target: <http://www.wikidata.org/entity/Q62087108>

SELECT DISTINCT 
?item ?title ?taxonname
  (REPLACE(STR(?item), ".*Q", "Q") AS ?work) 
  ("P921" AS ?main_subject)
  (REPLACE(STR(?topic), ".*Q", "Q") AS ?taxon)
  ("S887" AS ?heuristic)
  ("Q69652283" AS ?deduced)

WHERE
{
  target: wdt:P921 ?topic .
  ?topic wdt:P225 ?taxonname .
  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org";
                    wikibase:api "Generator";
                    mwapi:generator "search";
                    mwapi:gsrsearch ?taxonname;
                    mwapi:gsrlimit "max".
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item wdt:P1476 ?title .
  MINUS {?item wdt:P921 ?topic }

  FILTER (REGEX(LCASE(?title), LCASE(CONCAT( "\\", "b", ?taxonname ,"\\", "b"))))
}
LIMIT 200
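
The final FILTER in the draft keeps only works whose title contains the taxon name as a whole word, ignoring case. Below is a minimal Python sketch of that same heuristic, which may help when checking candidate titles outside the query service. The function name and the sample titles are made up for illustration. Unlike the draft query, the sketch escapes regex metacharacters in the taxon name; that is an addition here, not something the draft does.

```python
import re


def title_mentions_taxon(title: str, taxon_name: str) -> bool:
    """Return True if taxon_name occurs in title as a whole word, ignoring case.

    Mirrors the \\b...\\b word-boundary REGEX filter of the draft query.
    re.escape is an extra safeguard (not in the draft) so that characters
    such as '.' or '(' in a name are matched literally.
    """
    pattern = r"\b" + re.escape(taxon_name) + r"\b"
    return re.search(pattern, title, flags=re.IGNORECASE) is not None


# Hypothetical titles, for illustration only.
print(title_mentions_taxon("A genome assembly of Danio rerio", "Danio rerio"))
print(title_mentions_taxon("Notes on Danio rerioides", "Danio rerio"))
```

The second call does not match because the name is followed directly by further word characters, so there is no word boundary after it.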

[Screenshot: Wikidata Query Service, 2022-03-08]

Which Wikidata entries would be good candidates to explore such visualizations?

Works tagged with one or more common taxa as main subject: https://w.wiki/4vrq .

Anything else?

I'd like to add some haswbstatement commands to the gsrsearch call, but I am not sure whether this is possible and, if it is, how.
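
One possible direction: haswbstatement is a CirrusSearch keyword, so it would go inside the search string itself rather than into a separate parameter. Below is a hedged Python sketch of composing such a string and the matching MediaWiki API parameters. The helper names are hypothetical, restricting to P31=Q13442814 (scholarly article) is only an assumption about what one might want to filter on, and the QID in the usage line is a placeholder rather than a real taxon item. Whether the mwapi service in the query above accepts a string composed this way has not been verified here.

```python
def build_search_string(taxon_name: str, taxon_qid: str) -> str:
    """Compose a CirrusSearch string for works mentioning a taxon name.

    Assumptions (not from the draft query): restrict to scholarly articles
    and exclude items already tagged with the taxon as main subject (P921).
    """
    # Quote the name so it is searched as a phrase; drop embedded quotes.
    quoted = '"' + taxon_name.replace('"', "") + '"'
    return (
        f"{quoted} haswbstatement:P31=Q13442814 "
        f"-haswbstatement:P921={taxon_qid}"
    )


def build_api_params(taxon_name: str, taxon_qid: str) -> dict:
    """Parameters for a direct call to the MediaWiki API with generator=search."""
    return {
        "action": "query",
        "format": "json",
        "generator": "search",
        "gsrsearch": build_search_string(taxon_name, taxon_qid),
        "gsrlimit": "max",
    }


# "Q12345" is a placeholder QID used only to show the shape of the output.
print(build_search_string("Danio rerio", "Q12345"))
```

If this works inside the query, the negated haswbstatement clause would do roughly the same job as the MINUS clause, but before the results reach SPARQL.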

Daniel-Mietchen commented 2 years ago

Perhaps filter for species level by default:

  target: wdt:P921 ?topic .
  ?topic wdt:P225 ?taxonname .
  ?topic wdt:P105 wd:Q7432 . # this filters for species-level taxon names; comment out this line if you'd like to see other ranks too

Daniel-Mietchen commented 2 years ago

It's stabilizing, though still with issues around subspecies, variants, cultivars, hybrids and similar:

# For /work/
PREFIX target: <http://www.wikidata.org/entity/Q111196184>

SELECT DISTINCT 
?item ?title ?taxonname
  (REPLACE(STR(?item), ".*Q", "Q") AS ?work) 
  ("P921" AS ?main_subject)
  (REPLACE(STR(?topic), ".*Q", "Q") AS ?taxon)
  ("S887" AS ?heuristic)
  ("Q69652283" AS ?deduced)

WHERE
{
  target: wdt:P921 ?topic .
  ?topic wdt:P225 ?taxonname .
  ?topic wdt:P105 wd:Q7432 . # this filters for species-level taxon names; comment out this line if you'd like to see other ranks too
  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org";
                    wikibase:api "Generator";
                    mwapi:generator "search";
                    mwapi:gsrsearch ?taxonname;
                    mwapi:gsrlimit "max".
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item wdt:P1476 ?title .
  MINUS {?item wdt:P921 ?topic }

#   FILTER (?taxonname != "No").

  FILTER (REGEX(LCASE(?title), LCASE(CONCAT( "\\", "b", ?taxonname ,"\\", "b"))))
}
LIMIT 200

Daniel-Mietchen commented 2 years ago

Here is also a variant for author:

# For /author/

PREFIX target: <http://www.wikidata.org/entity/Q6389837>

SELECT DISTINCT 
# ?item ?title ?taxonname
  (REPLACE(STR(?item), ".*Q", "Q") AS ?work) 
  ("P921" AS ?main_subject)
  (REPLACE(STR(?topic), ".*Q", "Q") AS ?taxon)
  ("S887" AS ?heuristic)
  ("Q69652283" AS ?deduced)
WITH {
  SELECT DISTINCT ?work WHERE {
    target: ^wdt:P50 ?work .
  }
} AS %works
WITH {
  SELECT DISTINCT ?topic ?taxonname WHERE {
    INCLUDE %works
    ?work wdt:P921 ?topic .
    ?topic wdt:P225 ?taxonname .
    ?topic wdt:P105 wd:Q7432 . 
  }
} AS %taxonnames
WHERE
{
  INCLUDE %taxonnames
  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org";
                    wikibase:api "Generator";
                    mwapi:generator "search";
                    mwapi:gsrsearch ?taxonname;
                    mwapi:gsrlimit "max".
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item wdt:P1476 ?title .
  MINUS {?item wdt:P921 ?topic }

#   FILTER (?taxonname != "No").

  FILTER (REGEX(LCASE(?title), LCASE(CONCAT( "\\", "b", ?taxonname ,"\\", "b"))))
}
# LIMIT 200