JetBrains-Research / pubtrends

Scientific literature explorer. Runs a PubMed or Semantic Scholar search and lets the user explore the high-level structure of the resulting papers
Apache License 2.0
38 stars 2 forks

Investigate bioRxiv data availability #61

Open olegs opened 5 years ago

olegs commented 5 years ago

If everything else fails, we can fall back to crawling with Beautiful Soup, see: https://github.com/OmnesRes/prepub
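
For reference, a minimal sketch of what the crawling fallback could look like. Everything here is an assumption for illustration: the sample markup, the `paper-title` class, and the names `PaperLinkParser` and `extract_papers` are hypothetical and not taken from bioRxiv or from the prepub repository. To keep the sketch dependency-free it uses the standard-library `html.parser` instead of Beautiful Soup; with Beautiful Soup the same extraction would be a short CSS-selector query.

```python
from html.parser import HTMLParser

# Hypothetical listing markup, invented for illustration only; the real
# bioRxiv pages are structured differently and must be inspected first.
SAMPLE_HTML = """
<ul>
  <li><a class="paper-title" href="/content/example-1">First preprint</a></li>
  <li><a class="paper-title" href="/content/example-2">Second preprint</a></li>
  <li><a class="nav" href="/about">About</a></li>
</ul>
"""


class PaperLinkParser(HTMLParser):
    """Collects (href, title) pairs from <a class="paper-title"> tags."""

    def __init__(self):
        super().__init__()
        self.papers = []
        self._href = None  # href of the paper link currently being read
        self._text = []    # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Only start collecting for links carrying the assumed class.
        if tag == "a" and attrs.get("class") == "paper-title":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Closing a paper link: store the pair and stop collecting.
        if tag == "a" and self._href is not None:
            self.papers.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_papers(html):
    """Return a list of (href, title) tuples found in the given HTML."""
    parser = PaperLinkParser()
    parser.feed(html)
    return parser.papers


papers = extract_papers(SAMPLE_HTML)
```

A real crawler would additionally need to fetch pages politely (rate limiting, respecting robots.txt) and handle pagination, which this sketch leaves out.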

ctrltz commented 5 years ago

Summary about working with bioRxiv:

About bioRxiv in general:

olegs commented 5 years ago

Can you please remind us of the full size of webRxiv?

olegs commented 5 years ago

The crawler is available in a dedicated branch: biorxiv