BIDS-projects / scraper

Collects data from websites of data science institutions

finished pdf spider #14

ExandTran closed 8 years ago

ExandTran commented 8 years ago

Still need to work on not getting blocked by Google.

The spider searches Google Scholar for an author and downloads the PDF of the paper. The PDF is then converted to a .txt file via xpdf.

Note: You need to have xpdf installed.
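
For reference, the conversion step could look roughly like the sketch below. It assumes the conversion goes through `pdftotext`, the command-line tool that ships with xpdf, and that the text file is written next to the PDF. The function names (`txt_path_for`, `build_command`, `pdf_to_txt`) and the example file name are hypothetical and not taken from the spider itself.

```python
import shutil
import subprocess
from pathlib import Path


def txt_path_for(pdf_path):
    """Return the .txt path that sits next to the given PDF (assumed layout)."""
    return Path(pdf_path).with_suffix(".txt")


def build_command(pdf_path):
    """Build the pdftotext invocation: pdftotext <input.pdf> <output.txt>."""
    return ["pdftotext", str(pdf_path), str(txt_path_for(pdf_path))]


def pdf_to_txt(pdf_path):
    """Convert one PDF to text; raises if pdftotext is missing or fails."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install xpdf first")
    # check=True turns a non-zero exit status into CalledProcessError.
    subprocess.run(build_command(pdf_path), check=True)
    return txt_path_for(pdf_path)
```

Calling `pdf_to_txt("paper.pdf")` on a downloaded file would write `paper.txt` beside it. Checking for the binary up front gives a clearer error than a failed subprocess call when xpdf is not installed.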