icyphy / icyphy.github.io

#6 Use crawler to transfer publications from icyphy.org to the new site. #8

Closed lsk567 closed 4 years ago

lsk567 commented 4 years ago

#6: Since we do not have access to the database that stores the publications for the original site, it makes sense to scrape the data from https://ptolemy.berkeley.edu/projects/icyphy/ with a web crawler rather than transfer it manually. The crawler first visits the publications collection page for each year, then follows each publication link to retrieve that publication's details.
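The first crawling step (collecting publication links from a yearly collection page) could be sketched roughly as below. This is only an illustration using the Python standard library, not the crawler in this PR: the `/pubs/` URL pattern, the sample HTML, and the `example.org` base URL are all hypothetical, since the real page structure is not described here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def publication_links(year_page_html, base_url, marker="/pubs/"):
    """Return absolute, de-duplicated URLs whose path contains `marker`.

    `marker` is a hypothetical pattern; adjust it to the real site layout.
    """
    collector = LinkCollector()
    collector.feed(year_page_html)
    links = []
    for href in collector.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        if marker in url and url not in links:
            links.append(url)
    return links


# Hypothetical stand-in for one year's publications collection page.
SAMPLE_YEAR_PAGE = """
<ul>
  <li><a href="pubs/1.html">Paper one</a></li>
  <li><a href="pubs/2.html">Paper two</a></li>
  <li><a href="/about.html">About</a></li>
</ul>
"""

LINKS = publication_links(SAMPLE_YEAR_PAGE, "https://example.org/icyphy/")
```

In a real run, each URL in `LINKS` would then be fetched (for example with `urllib.request.urlopen`) to retrieve that publication's details.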

The idea is to store each publication in a general schema (title, journal, year, etc.) and generate citation styles on the fly, rather than storing three citation styles directly in each publication entry, which reduces the data-entry burden. To achieve this, the crawler reads the BibTeX of each publication, which is the closest available representation of a general schema, and parses it with a BibTeX parser. It then generates .md files in the _publications directory.
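As a rough sketch of that pipeline, the code below parses one BibTeX entry into a dictionary, renders it as front matter for a .md file, and formats a citation from the same record. It makes several assumptions: a small regular expression stands in for a real BibTeX parser (it handles only one level of nested braces), the sample entry is invented, and the front-matter keys, file naming, and citation layout are hypothetical rather than what this PR produces.

```python
import json
import re
from pathlib import Path

ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# A field value is {braced} (one nesting level), "quoted", or a bare number.
FIELD_RE = re.compile(
    r'(\w+)\s*=\s*(?:\{((?:[^{}]|\{[^{}]*\})*)\}|"([^"]*)"|(\d+))'
)


def parse_bibtex(entry):
    """Parse a single BibTeX entry into a flat dict (simplified stand-in)."""
    header = ENTRY_RE.search(entry)
    if header is None:
        raise ValueError("not a BibTeX entry")
    record = {"type": header.group(1).lower(), "key": header.group(2)}
    for match in FIELD_RE.finditer(entry, header.end()):
        value = next(g for g in match.groups()[1:] if g is not None)
        record[match.group(1).lower()] = " ".join(value.split())
    return record


def plain(value):
    """Drop BibTeX case-protection braces for display."""
    return value.replace("{", "").replace("}", "")


def to_front_matter(record):
    """Render the record as YAML front matter (hypothetical key set)."""
    lines = ["---"]
    for name in ("title", "author", "journal", "year"):
        if name in record:
            # JSON strings are valid YAML double-quoted scalars.
            lines.append(name + ": " + json.dumps(plain(record[name])))
    lines.append("---")
    return "\n".join(lines) + "\n"


def format_citation(record):
    """Build one citation string on the fly (illustrative layout only)."""
    parts = [record.get("author"), plain(record.get("title", "")),
             record.get("journal")]
    text = ". ".join(p for p in parts if p)
    year = record.get("year")
    return f"{text}, {year}." if year else f"{text}."


def write_publication(record, out_dir="_publications"):
    """Write <key>.md into the output directory and return its path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / (record["key"] + ".md")
    path.write_text(to_front_matter(record), encoding="utf-8")
    return path


# Invented entry, for illustration only.
SAMPLE_BIBTEX = """
@article{doe2020example,
  author = {Jane Doe and John Roe},
  title = {An {E}xample Title},
  journal = {Journal of Examples},
  year = 2020,
}
"""

RECORD = parse_bibtex(SAMPLE_BIBTEX)
```

Calling `write_publication(RECORD)` would create `_publications/doe2020example.md`, and `format_citation(RECORD)` shows how a citation can be derived from the stored fields instead of being stored itself.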

lhstrh commented 4 years ago

Merging PR into master.