Closed AbdBarho closed 5 years ago
Data model based on the Wikidata model:
Note: non-commented arrows signify 'subclass of'.
Problems:
Research the Wikidata data model to find properties that correspond to our column names and publications.
Frequency | Column in PIK data set | Wikidata property | Comment
---|---|---|---
8261 | title | title P1476 |
8258 | keywords | |
8235 | year | publication date P577 |
8080 | authors | author P50 |
7796 | publisher | publisher P123 |
6299 | startpage | number of pages P1104 | merge startpage and endpage, map to number of pages
6034 | endpage | number of pages P1104 |
4493 | journal | academic journal Q737498 | only paperr, papern, newspaper and inbook have entry for journal; link article to journal
4468 | x4 ( = DOI / Identifier) | DOI P356 |
3879 | vol | volume P478 |
3462 | issue | issue P433 |
2922 | place | place of publication P291 |
1656 | editors | editor P98 |
1516 | booktitle | | only inbook, inreport, confpaper, proceedings, epup have entry for booktitle
1340 | relation (= Serie) | part of the Series P179 |
974 | link | |
921 | comment | |
385 | conference | |
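
The mapping in the table can be captured as a small lookup, sketched here in Python. `COLUMN_TO_PROPERTY` and `number_of_pages` are hypothetical names, not existing code. The page-count formula (`endpage - startpage + 1`) is an assumption about how startpage and endpage would be merged, and it only works when both values are plain integers.

```python
from typing import Optional

# PIK column -> Wikidata property ID, copied from the table above.
# Columns with no mapped property (keywords, booktitle, link, comment,
# conference) are left out; journal maps to an item (Q737498), not a property.
COLUMN_TO_PROPERTY = {
    "title": "P1476",
    "year": "P577",
    "authors": "P50",
    "publisher": "P123",
    "x4": "P356",
    "vol": "P478",
    "issue": "P433",
    "place": "P291",
    "editors": "P98",
    "relation": "P179",
}


def number_of_pages(startpage, endpage) -> Optional[int]:
    """Merge startpage and endpage into a page count for P1104.

    Assumption: the range is inclusive. Returns None when either value
    is missing or non-numeric, or when the range is reversed.
    """
    try:
        start, end = int(startpage), int(endpage)
    except (TypeError, ValueError):
        return None
    if end < start:
        return None
    return end - start + 1
```

Rows where the page fields are not numeric (for example article numbers) yield `None`, so no P1104 value would be written for them.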
Questions: Where do we add P1433 venue (published in (not place))? Where do we add P921 topic (main subject)? What do we do with missing PIK properties?
P1433: There are some inconsistencies between how the data looks and how Scholia queries it.
For example, in Scholia's `author.html`
we see the following request:
?work wdt:P1433 ?venue .
whereas the official page of P1433 describes the property as: larger work that a given work was published in, like a book, journal or music album.
Then again, a venue is the physical place where something was published, but in this case it is used as `part of`, so maybe it is just a naming problem.
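P921: we have the column `keywordsAndPeerReview` (also named `x1 ( =Feld ""Keyword""; u.a. belegt mit Info zu peer-review, wenn kein ISI-Journal)`, i.e. the "Keyword" field, which among other things holds peer-review information when the journal is not an ISI journal), which might be a good candidate.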
For the most part, we can ignore the remaining PIK properties in order to deliver the first prototype on time. Additional input from PIK is needed on how important this information is and how it ties in with the other values we have.
Author inconsistencies: take the following example query for the author Didier Musso:
select ?work where {
?work wdt:P50 wd:Q24244119 .
}
We can see that he has published some work, and Scholia would recognize him as an author. However, on his Wikidata page we find no mention of the class `author` anywhere. The only link is through the occupation property, which has the value `researcher`, which is itself a subclass of `creator` (`author` is also a subclass of `creator`).
This further suggests that the whole data model is built on relationships between the different items rather than on class hierarchies.
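
To make the two ways of identifying an author concrete, here is a minimal Python sketch that only builds the SPARQL strings. The function names are hypothetical. The queries assume the `wd:` and `wdt:` prefixes predefined by the Wikidata Query Service, and they use P106 (occupation) and P279 (subclass of). Nothing is executed here, and no result is claimed.

```python
import re

# Wikidata item IDs look like "Q" followed by digits.
_QID = re.compile(r"Q[1-9][0-9]*")


def _check_qid(qid: str) -> str:
    """Reject anything that is not a plain item ID before it goes into a query."""
    if not isinstance(qid, str) or not _QID.fullmatch(qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return qid


def works_by_author_query(author_qid: str) -> str:
    """Relationship-based lookup: works that name the item as author (P50)."""
    qid = _check_qid(author_qid)
    return f"SELECT ?work WHERE {{ ?work wdt:P50 wd:{qid} . }}"


def occupation_subclass_query(person_qid: str, class_qid: str) -> str:
    """Class-based check: does any occupation (P106) of the person reach
    the given class through zero or more 'subclass of' (P279) steps?"""
    person = _check_qid(person_qid)
    cls = _check_qid(class_qid)
    return f"ASK {{ wd:{person} wdt:P106/wdt:P279* wd:{cls} . }}"
```

For the example above, `works_by_author_query("Q24244119")` reproduces the query shown, while `occupation_subclass_query("Q24244119", "Q482980")` would ask whether his occupation falls under Author Q482980.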
A publication is an instance of (P31) one of the following items:

- `paperr`: article Q191067
- `papern`: article Q191067
- `inbook`: chapter Q1980247
- `confpaper`: conference paper Q23927052
- `lecture`: lecture Q603773
- `report`: report Q10870555
- `epup`: electronic publication Q21572908
- `inreport`: research report Q59387148
- `intseries`: technical report Q3099732
- `book`: book Q571
- `newspaper`: newspaper article Q2495037
- `edbook`: edited volume Q1711593
- `data`: data publication Q17051824
- `software`: software project Q63437139
- `dipl`: diploma thesis Q30749496
- `habil`: habilitation thesis Q1414362
- `thesis`: doctoral thesis Q187685
- `proceedings`: proceedings Q1143604

Other items and relations:

- `author`: Author Q482980
- `author`: a work has author (P50) property
- `editor`: Editor Q1607826
- `editor`: a work has editor (P98) property
- `publisher`: Publisher Q2085381
- `publisher`: a work is published (P123) by a Publisher
- `journal`: academic journal Q737498
- `published in`: a publication is published in (P1433) a journal
- `issue`: (literal) a publication has an issue (P433)
- `vol`: (literal) a publication has a volume (P478)
- `startpage & endpage`: literal
- `number of pages`: a publication has number of pages (P1104)
- `DOI`: literal DOI (P356)
- `year`: literal publication date (P577)
- `[---]`: a place where a publication was published (country/city/etc...)
- `place`: a publication has property place of publication P291 to the place where it was published
- `[---]`: a series of publications
- `series`: a publication is part of the series (P179)
- `[---]`: anything...
- `keywords`: a publication has a list of main subjects (P921)
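
The type list above translates directly into a lookup from the PIK publication type to the Wikidata item used as the P31 value. The sketch below is illustrative only: `PUBLICATION_TYPE_TO_ITEM` and `instance_of_statement` are hypothetical names, and the item IDs are copied from this issue without being re-checked against Wikidata.

```python
from typing import Optional, Tuple

# PIK publication type -> Wikidata item ID, copied from the list above.
PUBLICATION_TYPE_TO_ITEM = {
    "paperr": "Q191067",
    "papern": "Q191067",
    "inbook": "Q1980247",
    "confpaper": "Q23927052",
    "lecture": "Q603773",
    "report": "Q10870555",
    "epup": "Q21572908",
    "inreport": "Q59387148",
    "intseries": "Q3099732",
    "book": "Q571",
    "newspaper": "Q2495037",
    "edbook": "Q1711593",
    "data": "Q17051824",
    "software": "Q63437139",
    "dipl": "Q30749496",
    "habil": "Q1414362",
    "thesis": "Q187685",
    "proceedings": "Q1143604",
}


def instance_of_statement(pik_type: str) -> Optional[Tuple[str, str]]:
    """Return the (property, item) pair for 'instance of' (P31),
    or None when the PIK type has no mapped item."""
    item = PUBLICATION_TYPE_TO_ITEM.get(pik_type)
    if item is None:
        return None
    return ("P31", item)
```

Returning `None` for unknown types keeps unmapped rows visible to the caller instead of silently assigning a default class.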
depends on #18 and #17