ceurws / lod

Anything we need to maintain the Linked Open Data (LOD) publication of CEUR-WS.org

Synchronize with Wikidata #25

Open WolfgangFahl opened 2 years ago

WolfgangFahl commented 2 years ago

see sample cases for:

WolfgangFahl commented 2 years ago
# 
# get CEUR-WS Proceedings records by Volume
# 
# WF 2022-08-13
#
# the Volume number P478 is sometimes available with the proceedings item and sometimes as a qualifier
# of the part of the series (P179) statement
#
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX schema: <http://schema.org/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?item ?itemLabel ?itemDescription
  ?ceurwspart
  ?sVolume
  ?Volume
  ?short_name
  ?event
  ?eventLabel
  ?title
  ?language_of_work_or_name ?language_of_work_or_nameLabel
  ?URN_NBN ?URN_NBNUrl
  ?publication_date
  ?fullWorkUrl
  ?described_at_URL
  ?homePage

WHERE {
  ?item rdfs:label ?itemLabel.
  FILTER(LANG(?itemLabel) = "en")
  OPTIONAL { 
    ?item schema:description ?itemDescription.
    FILTER(LANG(?itemDescription) = "en")
  }

  # Instance of Proceedings
  ?item wdt:P31 wd:Q1143604.
  # Part of the series
  ?item p:P179 ?partOfTheSeries.
  # CEUR Workshop proceedings
  ?partOfTheSeries ps:P179 wd:Q27230297.

  # Volume directly with the proceedings
  OPTIONAL {
    ?item wdt:P478 ?Volume.
  }
  # Volume via a qualifier of the part of the series relation
  OPTIONAL {
    ?partOfTheSeries pq:P478 ?sVolume.
  }
  # Acronym
  OPTIONAL {
    ?item wdt:P1813 ?short_name.
  }
  # Event the proceedings are from
  OPTIONAL {
    ?item wdt:P4745 ?event.
    ?event rdfs:label ?eventLabel.
    FILTER(LANG(?eventLabel) = "en")
  }
  # Title
  OPTIONAL {
    ?item wdt:P1476 ?title.
  }
  # Language
  OPTIONAL {
    ?item wdt:P407 ?language_of_work_or_name.
    ?language_of_work_or_name rdfs:label ?language_of_work_or_nameLabel.
    FILTER(LANG(?language_of_work_or_nameLabel) = "en")
  }
  # The URN shouldn't be optional
  OPTIONAL {
    ?item wdt:P4109 ?URN_NBN.
    wd:P4109 wdt:P1630 ?URN_NBNFormatterUrl.
    BIND(IRI(REPLACE(?URN_NBN, '^(.+)$', ?URN_NBNFormatterUrl)) AS ?URN_NBNUrl).
  }
  # publication date
  OPTIONAL {
    ?item wdt:P577 ?publication_date.
  }
  # full work available at
  OPTIONAL {
    ?item wdt:P953 ?fullWorkUrl.
  } 
  # described at url
  OPTIONAL {
    ?item wdt:P973 ?described_at_URL.
  }
  # homepage -> replace with full work available at
  OPTIONAL {
    ?item wdt:P856 ?homePage.
  }
} ORDER BY xsd:integer(?sVolume)

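Because the query returns the volume number in two different columns (`?Volume` from the direct P478 statement, `?sVolume` from the qualifier), a client usually has to merge them before sorting or comparing with CEUR-WS. The following is only a minimal sketch of that merge step, assuming the rows arrive in the standard SPARQL JSON results layout. The function names, the preference for the direct statement over the qualifier, and the sample rows (including the placeholder entity IDs) are illustrative assumptions, not part of this repository.

```python
from typing import Optional


def volume_number(binding: dict) -> Optional[int]:
    """Return the volume number of one result row, or None if it has none.

    Assumption: the direct statement (?Volume) is preferred over the
    qualifier on the part-of-the-series statement (?sVolume).
    """
    for var in ("Volume", "sVolume"):
        cell = binding.get(var)
        if cell is None:
            continue  # OPTIONAL pattern did not match for this row
        value = cell.get("value", "").strip()
        if value.isdecimal():
            return int(value)
    return None


def sort_by_volume(bindings: list) -> list:
    """Sort rows by volume number; rows without a usable number go last."""
    def key(binding: dict):
        number = volume_number(binding)
        return (number is None, number if number is not None else 0)

    return sorted(bindings, key=key)


# Made-up rows in the SPARQL JSON results layout; the entity IDs are placeholders.
rows = [
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q_B"},
     "sVolume": {"type": "literal", "value": "3000"}},
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q_A"},
     "Volume": {"type": "literal", "value": "2"}},
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q_C"}},
]

ordered = sort_by_volume(rows)
print([volume_number(row) for row in ordered])
```

Rows for which neither column holds a number are kept and sorted last, so they can be reported as records that still need a volume number in Wikidata.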

fnielsen commented 2 years ago

Excellent idea. I noted that down as an issue for Scholia https://github.com/WDscholia/scholia/issues/1438 but if it is handled elsewhere we do not need to implement it in Scholia.

fnielsen commented 2 years ago

Would it also entail the metadata about the articles?

WolfgangFahl commented 2 years ago

We'll start with the proceedings, then work through the submitters, editors and papers.

The basic analysis work to collect as much metadata as possible has been done multiple times between 2014 and today, but the results have not had a permanent sink so far. That's why I want to put an emphasis on the synchronization issue with Wikidata, to give the metadata a permanent and LOD-accessible home.

see:

WolfgangFahl commented 2 years ago

see also CEUR-WS Volume Browser

WolfgangFahl commented 2 years ago

I think the main issue will be the dblp author name disambiguation and synchronization, since CEUR-WS editors have a dblp footprint by definition but not necessarily a Wikidata entry yet. Is there a dblp synchronization issue in Scholia and a matching tool available?

fnielsen commented 2 years ago

"Is there a dblp synchronization issue in Scholia and a matching tool available?"

No. Not as far as I remember.