WDscholia / scholia

Wikidata-based scholarly profiles

On Scholia homepage, show timeline of relevant stats #624

Open Daniel-Mietchen opened 5 years ago

Daniel-Mietchen commented 5 years ago

This would complement #336, which does not provide timelines.

I don't see a way to do this for arbitrary queries, but https://www.wikidata.org/wiki/Template:Property_uses keeps track of the usage of each property in a way that can be accessed through SPARQL, as per the "usage history" link on each property's talk page.

I have taken this for P921 and simplified it a bit, which makes it faster:

# Chart of P921 usage
# Note: this chart is based on https://www.wikidata.org/wiki/Template:Property_uses
# which is updated once a day by PLbot

SELECT ?day ?count WITH { SELECT (".+\\|921=(\\d+).+" as ?r) ("|921=" as ?p)
  (IF(CONTAINS(?r1, ?p), xsd:integer(REPLACE(?r1, ?r, "$1")), -1) AS ?c1) (xsd:dateTime(?t1) AS ?d1)
  (IF(CONTAINS(?r11, ?p), xsd:integer(REPLACE(?r11, ?r, "$1")), -1) AS ?c11) (xsd:dateTime(?t11) AS ?d11)
  (IF(CONTAINS(?r21, ?p), xsd:integer(REPLACE(?r21, ?r, "$1")), -1) AS ?c21) (xsd:dateTime(?t21) AS ?d21)
  (IF(CONTAINS(?r31, ?p), xsd:integer(REPLACE(?r31, ?r, "$1")), -1) AS ?c31) (xsd:dateTime(?t31) AS ?d31)
  (IF(CONTAINS(?r41, ?p), xsd:integer(REPLACE(?r41, ?r, "$1")), -1) AS ?c41) (xsd:dateTime(?t41) AS ?d41)
  (IF(CONTAINS(?r51, ?p), xsd:integer(REPLACE(?r51, ?r, "$1")), -1) AS ?c51) (xsd:dateTime(?t51) AS ?d51)
  { SERVICE wikibase:mwapi {
      bd:serviceParam wikibase:api "Generator" ; wikibase:endpoint "www.wikidata.org" ; mwapi:generator "allpages" ; 
                      mwapi:gapfrom "Property_uses" ; mwapi:gapto "Property_uses" ; mwapi:gapnamespace "10" ; 
                      mwapi:prop "revisions" ; mwapi:rvprop "content|timestamp" ; mwapi:rvlimit "51" ; mwapi:rvuser "PLbot" .
      ?t1 wikibase:apiOutput "revisions/rev[1]/@timestamp" . ?r1 wikibase:apiOutput "revisions/rev[1]/text()" .
      ?t11 wikibase:apiOutput "revisions/rev[11]/@timestamp" . ?r11 wikibase:apiOutput "revisions/rev[11]/text()" .
      ?t21 wikibase:apiOutput "revisions/rev[21]/@timestamp" . ?r21 wikibase:apiOutput "revisions/rev[21]/text()" .
      ?t31 wikibase:apiOutput "revisions/rev[31]/@timestamp" . ?r31 wikibase:apiOutput "revisions/rev[31]/text()" .
      ?t41 wikibase:apiOutput "revisions/rev[41]/@timestamp" . ?r41 wikibase:apiOutput "revisions/rev[41]/text()" .
      ?t51 wikibase:apiOutput "revisions/rev[51]/@timestamp" . ?r51 wikibase:apiOutput "revisions/rev[51]/text()" .
  } } } as %revs {
  {BIND(?c1 AS ?count) BIND(?d1 AS ?day) INCLUDE %revs} UNION
  {BIND(?c11 AS ?count) BIND(?d11 AS ?day) INCLUDE %revs} UNION
  {BIND(?c21 AS ?count) BIND(?d21 AS ?day) INCLUDE %revs} UNION
  {BIND(?c31 AS ?count) BIND(?d31 AS ?day) INCLUDE %revs} UNION
  {BIND(?c41 AS ?count) BIND(?d41 AS ?day) INCLUDE %revs} UNION
  {BIND(?c51 AS ?count) BIND(?d51 AS ?day) INCLUDE %revs}
  FILTER(?count != -1)
}

On that basis, we can think about expanding the query to show timelines for several properties in one graph. Obvious candidates: P577 (publication date), P1476 (title), P50 (author), P2093 (author name string), P2860 (cites), P108 (affiliation), P625 (geolocation) and some of the key identifiers, e.g. P356 (DOI), P496 (ORCID).
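To make the per-property extraction step concrete, here is a minimal offline sketch in Python of what the query does with each revision, extended to several properties at once. It is not the SPARQL itself. The sample wikitext, the timestamps and the helper names `usage_count` and `timelines` are made up for illustration; only the `|<property number>=<count>` pattern and the `-1` sentinel are taken from the query above.

```python
import re

# Hypothetical, made-up revisions of Template:Property_uses as (timestamp, wikitext).
# Only the "|<property number>=<count>" pattern is taken from the query's regex;
# the real template text may look different otherwise.
SAMPLE_REVISIONS = [
    ("2019-03-01T00:00:00Z", "{{#switch:{{{1}}}|50=900|921=300|2860=7000}}"),
    ("2019-02-01T00:00:00Z", "{{#switch:{{{1}}}|50=800|921=250}}"),
]


def usage_count(wikitext: str, property_number: int) -> int:
    """Return the count stored for P<property_number>, or -1 if it is absent.

    Plays the role of IF(CONTAINS(...), xsd:integer(REPLACE(...)), -1) in the query.
    """
    match = re.search(rf"\|{property_number}=(\d+)", wikitext)
    return int(match.group(1)) if match else -1


def timelines(revisions, property_numbers):
    """Collect (timestamp, count) points per property, skipping -1 entries."""
    result = {number: [] for number in property_numbers}
    for timestamp, wikitext in revisions:
        for number in property_numbers:
            count = usage_count(wikitext, number)
            if count != -1:  # same role as FILTER(?count != -1)
                result[number].append((timestamp, count))
    return result


if __name__ == "__main__":
    for number, points in timelines(SAMPLE_REVISIONS, [50, 921, 2860]).items():
        print(f"P{number}: {points}")
```

In the SPARQL version, each additional property means another regex and another set of `IF(CONTAINS(...))` columns per sampled revision, so the work grows with the number of properties times the number of revisions.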

Daniel-Mietchen commented 5 years ago

Here is a version with six WikiCite properties.

Six seems to be the current maximum that does not normally cause a timeout.