datalad / datalad-crawler

DataLad extension for tracking web resources as datasets
http://datalad.org

figshare crawler #20

Open yarikoptic opened 5 years ago

yarikoptic commented 5 years ago

In datalad "core" we already have export_to_figshare, which uses the figshare API. We could/should also use that API to provide crawling of figshare datasets. Crawling should go through the API rather than website scraping, because a dataset may have multiple versions and the website makes extensive use of JS. A good candidate for a sample dataset would be BOLD5000, which currently has 3 versions.
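
A minimal sketch of what version-aware listing via the API might look like. It assumes the public figshare API v2 endpoints `/articles/{id}/versions` and `/articles/{id}/versions/{n}`, and the `version`, `files`, `name` and `download_url` response fields, as I recall them from the figshare documentation. `fetch_json` and `list_version_files` are hypothetical helper names, not part of datalad or datalad-crawler, and this is not wired into a crawler pipeline.

```python
import json
from urllib.request import urlopen

# Assumed base URL of the public figshare API v2.
API_BASE = "https://api.figshare.com/v2"


def fetch_json(url):
    """Fetch a URL and decode the JSON response body."""
    with urlopen(url) as resp:
        return json.load(resp)


def list_version_files(article_id, fetch=fetch_json):
    """Map each version number of an article to its (name, download_url) pairs.

    `fetch` is injectable so the logic can be exercised without network access.
    """
    versions = fetch(f"{API_BASE}/articles/{article_id}/versions")
    files_by_version = {}
    for entry in versions:
        number = entry["version"]
        details = fetch(f"{API_BASE}/articles/{article_id}/versions/{number}")
        files_by_version[number] = [
            (f["name"], f["download_url"]) for f in details.get("files", [])
        ]
    return files_by_version
```

A crawler could then walk the versions in ascending order and record each one as a separate commit or tag in the dataset. The article id has to be supplied by the caller; none is given here.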