JonathanReeve / data-ethics-literature-review

An automated survey of literature and curricula surrounding ethics in data science. WIP.
http://data-ethics.tech
GNU General Public License v3.0

Look up texts in Semantic Scholar and use that data to augment our bibliographic data #27

Open JonathanReeve opened 3 years ago

JonathanReeve commented 3 years ago

This is a sub-task of #21.

Let's use the Semantic Scholar API to resolve texts, and use the data retrieved from that API to augment our own bibliographic data.

The easy ones will be ones that already have stable identifiers, like DOIs:

deText:DH2SFT9V a z:UserItem ;
    res:resource [ a bibo:Book ;
            dcterms:creator _:fa209d3f7f19943c992157cabc55ce5e6b28 ;
            dcterms:date "2017" ;
            dcterms:publisher [ a foaf:Organization ;
                    foaf:name "Philosophy & Technology" ] ;
            dcterms:title "―Five Kinds of Cyber Deterrence.‖" ;
            bibo:authorList [ a rdf:Seq ;
                    rdf:_1 _:fa209d3f7f19943c992157cabc55ce5e6b28 ] ;
            z:extra "DOI: 10.1007/s13347-016-0251-1" ] .

So we should probably start with these. For the others, we'll need to construct a query with as much data as possible: search by title and author, and maybe other fields as well.
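A minimal sketch of that two-tier lookup, assuming the Semantic Scholar Graph API endpoints (`/graph/v1/paper/DOI:<doi>` and `/graph/v1/paper/search`) and that the DOI sits in the `z:extra` string as in the example above. The function names, the field list, and the DOI regular expression are illustrative choices, not code from this repository:

```python
import json
import re
import urllib.parse
import urllib.request

# Assumed endpoint and field list; adjust to whatever the project needs.
API_ROOT = "https://api.semanticscholar.org/graph/v1/paper"
FIELDS = "title,authors,year,externalIds,url"

# Loose DOI pattern: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+")


def extract_doi(extra):
    """Return the first DOI found in a z:extra string, or None."""
    match = DOI_PATTERN.search(extra or "")
    return match.group(0).rstrip(".,;") if match else None


def lookup_url(doi=None, title=None, author=None):
    """Build a request URL: direct DOI lookup if possible, else a keyword search."""
    if doi:
        paper_id = urllib.parse.quote("DOI:" + doi, safe=":/")
        return f"{API_ROOT}/{paper_id}?fields={FIELDS}"
    query = " ".join(part for part in (title, author) if part)
    params = urllib.parse.urlencode({"query": query, "fields": FIELDS, "limit": 5})
    return f"{API_ROOT}/search?{params}"


def fetch(url):
    """Fetch and decode one JSON response (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

For the record above, `lookup_url(doi=extract_doi("DOI: 10.1007/s13347-016-0251-1"))` builds the direct lookup URL; records without a DOI fall through to the search URL, whose results would still need to be checked against our own title and author.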

We should probably use the titles and authors that Semantic Scholar returns, rather than our own, which might (as in this example) have extra stuff like punctuation or stray marks.
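One possible way to do that safely, sketched under assumptions: normalize both titles (drop punctuation and stray marks, fold case) and only take the Semantic Scholar title when the two are nearly the same string. The function names and the 0.9 threshold are hypothetical and would need tuning against real data:

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Replace punctuation and stray marks with spaces, collapse whitespace, fold case."""
    stripped = re.sub(r"[^\w\s]", " ", title)
    return " ".join(stripped.split()).casefold()


def preferred_title(ours, theirs, threshold=0.9):
    """Return the Semantic Scholar title only if it closely matches our own."""
    ratio = SequenceMatcher(None, normalize(ours), normalize(theirs)).ratio()
    return theirs if ratio >= threshold else ours
```

With the example record, the normalized forms of our title and a clean "Five Kinds of Cyber Deterrence" are identical, so the clean one is kept; a title that differs substantially is treated as a likely mismatch and our own value is left alone.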

Ideally we'd have some way of judging which source has the correct value for each metadatum (title, author, etc.) by comparing them, but that's probably beyond the scope of this project.

JonathanReeve commented 3 years ago

@Zhuohan-Amber, can you take this one? I'll assign it to you, if you don't mind.

JonathanReeve commented 3 years ago

Here is the beginning of an example where I'm querying the CrossRef API for bibliographic data that I'm then using to augment the graph.

Zhuohan-Amber commented 3 years ago

> @Zhuohan-Amber, can you take this one? I'll assign it to you, if you don't mind.

Sure. I can work on this issue.

JonathanReeve commented 3 years ago

@Zhuohan-Amber, just leave a note here if you have any questions about this issue, or how best to approach it.

Zhuohan-Amber commented 3 years ago

Given a DOI, I can use Semantic Scholar to extract the author, title, citations, URL, and other keys. Next, I will try to figure out how to get the DOIs from coursesAndTexts.ttl.
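A rough sketch of one way to pull the DOIs out of the Turtle file, assuming the serialization looks like the record quoted earlier in this thread: each item starts at column 0 with `<subject> a <type>`, and the DOI appears inside a `z:extra` literal on one line. This is a line-based scan rather than a real RDF parse, and `dois_by_subject` is a hypothetical name; a proper Turtle parser would be more robust if the layout varies.

```python
import re

# A new record: an unindented "<subject> a <type>" line.
SUBJECT = re.compile(r"^(\S+)\s+a\s+\S+")
# A DOI inside a z:extra string literal on the same line.
EXTRA_DOI = re.compile(r'z:extra\s+"[^"]*?\b(10\.\d{4,9}/[^\s"]+)')


def dois_by_subject(turtle_text):
    """Map each top-level subject to the DOI found in its z:extra literal."""
    found = {}
    subject = None
    for line in turtle_text.splitlines():
        start = SUBJECT.match(line)
        if start:
            subject = start.group(1)
        hit = EXTRA_DOI.search(line)
        if hit and subject:
            found[subject] = hit.group(1)
    return found
```

It could be called as `dois_by_subject(open("coursesAndTexts.ttl", encoding="utf-8").read())`; items that come back without a DOI would then need the title-and-author search instead.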