blochberger / sokman

Manages your SoKs
ISC License

Add Support for Importing Bibtex Files #1

Open josepaiva94 opened 3 years ago

josepaiva94 commented 3 years ago

I have a list of relevant publications in a BibTeX file. How can I start the snowballing process from such a list?

blochberger commented 3 years ago

You could use bibtexparser to process the file in Python and create the necessary objects (Author and Publication). See, for example:

https://github.com/blochberger/sokman/blob/c7848600ba3f61c55b25521e8f2c855a314cc1b2/sok/management/commands/dblpimport.py#L446-L484
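As a rough illustration of the parsing half of that idea, here is a minimal sketch. It assumes the bibtexparser 1.x API (`bibtexparser.load(...).entries`, a list of dicts with an `ID` key); `Record`, `normalize_author` and `to_record` are hypothetical names, not part of sokman. Mapping the records onto the actual Author and Publication models is left out, since that depends on the model fields in the linked code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


def load_entries(path: str) -> List[Dict[str, str]]:
    """Read a .bib file into a list of dicts (bibtexparser 1.x API assumed)."""
    import bibtexparser  # third-party: pip install "bibtexparser<2"

    with open(path, encoding="utf-8") as bibtex_file:
        return bibtexparser.load(bibtex_file).entries


@dataclass
class Record:
    """Hypothetical stand-in for the data a Publication and its Authors need."""

    cite_key: str
    title: str
    year: Optional[int]
    doi: Optional[str]
    authors: List[str]


def normalize_author(name: str) -> str:
    """Turn 'Last, First' into 'First Last'; leave other forms untouched."""
    if "," in name:
        last, first = (part.strip() for part in name.split(",", 1))
        return f"{first} {last}"
    return name.strip()


def to_record(entry: Dict[str, str]) -> Record:
    """Map one parsed BibTeX entry onto a Record."""
    year = entry.get("year", "").strip()
    # Collapse line breaks, then split on the BibTeX author separator ' and '.
    raw_authors = " ".join(entry.get("author", "").split()).split(" and ")
    return Record(
        cite_key=entry["ID"],
        # Drop the braces BibTeX uses to protect capitalization.
        title=entry.get("title", "").replace("{", "").replace("}", "").strip(),
        year=int(year) if year.isdigit() else None,
        doi=entry.get("doi") or None,
        authors=[normalize_author(a) for a in raw_authors if a.strip()],
    )
```

Calling `[to_record(e) for e in load_entries("refs.bib")]` (the file name is just an example) would give you one record per entry, and any record whose `doi` is `None` needs an ID assigned before snowballing.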

If you already use DBLP cite keys in your file, you can simply import the publications with

./manage.py dblpimport 'DBLP:conf/ease/PetersenFMM08' 'DBLP:conf/ccs/EgeleBFK13'

You can provide as many cite keys as you like in a single command. Add the --use-api flag if you do not have a DBLP dump and want to fetch a live version from the server. Be aware, though, that a request is made for each key provided.
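If the cite keys are spread over a larger file, collecting them by hand gets tedious. The following is a small sketch using only the standard library; the regular expression is a simplification rather than a full BibTeX parser, and `dblp_cite_keys` and `dblpimport_command` are hypothetical helper names.

```python
import re
import shlex
from typing import List

# Matches the key of each entry, e.g. '@inproceedings{DBLP:conf/ccs/EgeleBFK13,'
ENTRY_KEY = re.compile(r"@\w+\s*\{\s*([^,\s{}]+)\s*,")


def dblp_cite_keys(bibtex: str) -> List[str]:
    """Return DBLP-style cite keys in order of appearance, without duplicates."""
    keys = [key for key in ENTRY_KEY.findall(bibtex) if key.startswith("DBLP:")]
    return list(dict.fromkeys(keys))


def dblpimport_command(keys: List[str], use_api: bool = False) -> List[str]:
    """Build the argument list for './manage.py dblpimport'."""
    argv = ["./manage.py", "dblpimport"]
    if use_api:
        # One request per key is made when this flag is set.
        argv.append("--use-api")
    return argv + keys


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], encoding="utf-8") as bibtex_file:
        command = dblpimport_command(dblp_cite_keys(bibtex_file.read()))
    # Print the command so it can be reviewed before running it.
    print(shlex.join(command))
```

Entries whose keys do not start with `DBLP:` are skipped and would still need to be imported some other way.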

Note that the Semantic Scholar API integration used for snowballing works based on the DOI (or the Semantic Scholar paper ID). If your BibTeX file contains DOIs, it should work out of the box; otherwise you might need to assign the IDs yourself. You could search DBLP by publication title to semi-automatically identify the matching DBLP entries (or, more specifically, their DOIs). You could do the same with the Semantic Scholar API.
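A sketch of what the DBLP title lookup could look like, using only the standard library. It assumes the public DBLP publication search endpoint and a JSON response shaped as `result -> hits -> hit -> info`; please check that layout against a live response before relying on it. The function names are hypothetical and nothing here is part of sokman.

```python
import json
from typing import Any, Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen

DBLP_SEARCH = "https://dblp.org/search/publ/api"


def search_url(title: str, max_hits: int = 5) -> str:
    """Build a DBLP publication search URL for the given title."""
    query = urlencode({"q": title, "format": "json", "h": max_hits})
    return f"{DBLP_SEARCH}?{query}"


def candidates(payload: Dict[str, Any]) -> List[Dict[str, str]]:
    """Extract hits that carry a DOI (response layout is an assumption)."""
    hits = payload.get("result", {}).get("hits", {}).get("hit", [])
    found = []
    for hit in hits:
        info = hit.get("info", {})
        if "doi" in info:
            found.append({
                "cite_key": "DBLP:" + info.get("key", ""),
                "title": info.get("title", ""),
                "doi": info["doi"],
            })
    return found


def lookup(title: str) -> List[Dict[str, str]]:
    """Query DBLP over the network and return DOI candidates for a title."""
    with urlopen(search_url(title)) as response:
        return candidates(json.load(response))
```

Because titles are rarely unique, the candidates are best reviewed by hand before a DOI is assigned, which is why this stays semi-automatic.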

josepaiva94 commented 3 years ago

I converted the BibTeX entries into an SQL insert statement on sok_publications. Then I ran ./manage.py repair, and now I can run ./manage.py snowball :-)

blochberger commented 3 years ago

I think the use case is still interesting and worth supporting, so I will keep the issue open as a reminder.

josepaiva94 commented 3 years ago

Here is the small script, for anyone facing the same use case: https://gist.github.com/josepaiva94/c91834935923e8394aa19ed766d8fa51 (DOIs are mandatory!)