ContentMine / getpapers

Get metadata, fulltexts or fulltext URLs of papers matching a search query
MIT License

-k cannot go below 100 hits #94

Open petermr opened 8 years ago

petermr commented 8 years ago

For demos it is useful to download a small number of files. -k seems to have a minimum of 100:

With a value larger than 100, the -k value is used:

localhost:projects pm286$ getpapers -q camel -o camel -x -k 123
info: Searching using eupmc API
info: Found 2213 open access results
info: Limiting to 123 hits
Retrieving results [==============================] 100% (eta 0.0s)
info: Done collecting results
info: Duplicate records found: 122 unique results identified
info: Saving result metadata
info: Full EUPMC result metadata written to eupmc_results.json
info: Extracting fulltext HTML URL list (may not be available for all articles)
info: Fulltext HTML URL list written to eupmc_fulltext_html_urls.txt
warn: Article with pmid "26866228 did not have a PMCID (therefore no XML)
info: Got XML URLs for 121 out of 122 results
info: Downloading fulltext XML files
Downloading files [==============================] 100% (121/121) [1.9s elapsed, eta 0.0]
info: All XML downloads succeeded!

With a value smaller than 100, the log reports the requested limit but 100 results are still retrieved:

localhost:projects pm286$ getpapers -q camel -o camel -x -k 23
info: Searching using eupmc API
info: Found 2213 open access results
info: Limiting to 23 hits
Retrieving results [==============================] 100% (eta 0.0s)
info: Done collecting results
info: Saving result metadata
info: Full EUPMC result metadata written to eupmc_results.json
info: Extracting fulltext HTML URL list (may not be available for all articles)
info: Fulltext HTML URL list written to eupmc_fulltext_html_urls.txt
info: Got XML URLs for 100 out of 100 results
info: Downloading fulltext XML files
Downloading files [==============================] 100% (100/100) [2.0s elapsed, eta 0.0]
info: All XML downloads succeeded!
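
The two runs above suggest that the limit is honoured only once it exceeds 100 hits. The following is a minimal sketch of how a hit limit can be applied to a paged search so that values below the page size also work. It is not taken from the getpapers source: `collect_hits`, `fake_fetch` and `PAGE_SIZE` are hypothetical names, and the page size of 100 is an assumption chosen only because it matches the floor seen in the logs.

```python
from typing import Callable, List

# Assumed page size; picked to match the 100-hit floor in the logs above.
PAGE_SIZE = 100


def collect_hits(
    fetch_page: Callable[[int, int], List[str]], total: int, limit: int
) -> List[str]:
    """Collect at most `limit` hits from a paged search (hypothetical helper)."""
    wanted = min(limit, total)
    hits: List[str] = []
    offset = 0
    while len(hits) < wanted:
        # Never ask for more than is still needed, so a limit of 23
        # requests 23 records rather than a full page of 100.
        size = min(PAGE_SIZE, wanted - len(hits))
        page = fetch_page(offset, size)
        if not page:
            break  # the source ran out of results early
        hits.extend(page)
        offset += len(page)
    # Truncate as a safeguard in case a page came back larger than requested.
    return hits[:wanted]


def fake_fetch(offset: int, size: int) -> List[str]:
    """Stand-in for a real search API that holds 2213 results."""
    return [f"PMC{i}" for i in range(offset, min(offset + size, 2213))]


if __name__ == "__main__":
    print(len(collect_hits(fake_fetch, 2213, 23)))
    print(len(collect_hits(fake_fetch, 2213, 123)))
```

The final slice is the important part of the design: even if the per-page request size cannot be reduced, truncating the collected list keeps the number of downloads equal to the requested limit.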

tarrow commented 8 years ago

This is solved in pull request #87, if you want to take a look.