petermr / pygetpapers

a Python version of getpapers
Apache License 2.0

EPMC: Building corpus using the existing metadata JSON #29

Open ShweataNHegde opened 2 years ago

ShweataNHegde commented 2 years ago

Given that I already have a eupmc_results.json, can't I build the corpus from scratch using it? I tried:

pygetpapers -o tomato --restart -x

The JSON file was inside the tomato folder, but all I get is empty PMC folders. Here's a portion of the tree:

C:.
│   eupmc_results.json
│
├───PMC3193516
├───PMC3466413
├───PMC3790869
├───PMC4032488
├───PMC4364678
├───PMC4375501
├───PMC4445982
├───PMC4464248
...

I think this would be a useful feature to add, if it doesn't already exist.
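
To check how many of the paper folders really are empty, a small diagnostic along these lines might help. It is only a sketch and not part of pygetpapers: the function name `find_empty_paper_dirs` is hypothetical, and it assumes nothing beyond the layout shown in the tree above (an output folder such as tomato containing one PMC-prefixed subfolder per paper).

```python
from pathlib import Path


def find_empty_paper_dirs(corpus_dir):
    """Return the sorted names of PMC* subfolders that contain nothing.

    Hypothetical helper for diagnosis only; it does not call pygetpapers.
    """
    root = Path(corpus_dir)
    if not root.is_dir():
        # Nothing to inspect if the output folder does not exist.
        return []
    empty = []
    for child in root.iterdir():
        if child.is_dir() and child.name.startswith("PMC"):
            # any() stops at the first entry found, so full folders are cheap to check.
            if not any(child.iterdir()):
                empty.append(child.name)
    return sorted(empty)


if __name__ == "__main__":
    # "tomato" is the output folder name used in the command above.
    for name in find_empty_paper_dirs("tomato"):
        print(name)
```

Running it from the directory that contains tomato prints one folder name per line for every paper folder that has no files in it.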