Closed rossmounce closed 9 years ago
It's useful to give as much diagnostic output as possible. I get:
localhost:junk pm286$ getpapers -q 'dinosaurs' --api eupmc -p -o pdf_test_eupmc
info: Searching using eupmc API
info: Found 769 open access results
Retrieving results [==============================] 100% (eta 0.0s)
info: Done collecting results
info: Duplicate records found: 750 unique results identified
info: Saving result metdata
info: Full EUPMC result metadata written to eupmc_results.json
info: Extracting fulltext HTML URL list (may not be available for all articles)
warn: Article with pmcid "PMC4439161" had no fulltext HTML url
warn: Article with pmcid "PMC4373905" had no fulltext HTML url
warn: Article with pmcid "PMC4468865" had no fulltext HTML url
warn: Article with pmcid "PMC4452486" had no fulltext HTML url
...
warn: Article with pmcid "PMC3192393" had no fulltext PDF url
info: Downloading fulltext PDF files
Downloading files [=======================-------] 75% (eta 0.2s)
/Users/pm286/.nvm/v0.10.38/lib/node_modules/getpapers/lib/eupmc.js:333
fourohfour();
^
TypeError: undefined is not a function
at /Users/pm286/.nvm/v0.10.38/lib/node_modules/getpapers/lib/eupmc.js:333:11
at /Users/pm286/.nvm/v0.10.38/lib/node_modules/getpapers/node_modules/got/index.js:152:6
at BufferStream.<anonymous> (/Users/pm286/.nvm/v0.10.38/lib/node_modules/getpapers/node_modules/got/node_modules/read-all-stream/index.js:52:3)
at BufferStream.emit (events.js:117:20)
at finishMaybe (/Users/pm286/.nvm/v0.10.38/lib/node_modules/getpapers/node_modules/got/node_modules/read-all-stream/node_modules/readable-stream/lib/_stream_writable.js:460:14)
at afterWrite (/Users/pm286/.nvm/v0.10.38/lib/node_modules/getpapers/node_modules/got/node_modules/read-all-stream/node_modules/readable-stream/lib/_stream_writable.js:340:3)
at /Users/pm286/.nvm/v0.10.38/lib/node_modules/getpapers/node_modules/got/node_modules/read-all-stream/node_modules/readable-stream/lib/_stream_writable.js:327:9
at process._tickCallback (node.js:448:13)
localhost:junk pm286$
localhost:pdf_test_eupmc pm286$ ls -lt | wc
733 6590 43943
localhost:pdf_test_eupmc pm286$ wc fulltext_html_urls.txt
18 19 778 fulltext_html_urls.txt
localhost:pdf_test_eupmc pm286$ wc eupmc_results.json
294467 593731 7207505 eupmc_results.json
Looks like you missed the crash in some way.
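For context, the trace says that `fourohfour` was not bound to a function at the point where eupmc.js:333 called it. The snippet below is only a sketch of that failure mode and one defensive guard. `handleDownload`, `statusCode` and `onNotFound` are hypothetical names made up for illustration; this is not getpapers' actual code.

```javascript
// Hypothetical sketch: names and structure are illustrative only,
// not taken from getpapers' eupmc.js.
function handleDownload(statusCode, onNotFound) {
  if (statusCode === 404) {
    // Guard: only call the 404 handler if it really is a function,
    // instead of crashing with a TypeError mid-download.
    if (typeof onNotFound === 'function') {
      onNotFound();
      return 'handled';
    }
    return 'skipped';
  }
  return 'ok';
}
```

With a guard like this, a missing handler would lead to the file being skipped rather than the whole run aborting at 75%.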
@rossmounce the reason you get different numbers of results for those two queries is that they are different queries! The first one searches only PLOS and gets 325 unique results. The second one searches the whole of EPMC and gets >700 results.
If you use the same query with and without -p, you get the same number of results.
The problem with the HTML url lists is a separate issue and has been fixed in 3155ce6.
Why can't a metadata-only getpapers search give the user a list of dinosaur-related fulltext URLs from PLOS ONE?
(edit: same for PeerJ & eLife. Even when doing metadata-only searches, I would like/expect getpapers to output a fulltext_urls.txt file)
The JSON file from the above metadata-only query contains 325 items.
Compare this with the same search with -p added, where the JSON file contains 750 records, the URL file contains 18, and it downloaded ~33 PDFs. Super inconsistent!
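A quick way to put numbers on the mismatch is to count the records in the results JSON and the lines in the URL list side by side. This is a minimal sketch, assuming the results file is a JSON array with one entry per record (I have not checked that against the real file); `countRecords`, `countUrls` and `compareCounts` are hypothetical helper names.

```javascript
// Assumption: the results file is a JSON array, one element per record.
function countRecords(jsonText) {
  const data = JSON.parse(jsonText);
  return Array.isArray(data) ? data.length : 0;
}

// Count non-empty lines in a URL list (one URL per line).
function countUrls(listText) {
  return listText.split(/\r?\n/).filter(function (line) {
    return line.trim() !== '';
  }).length;
}

// Report both counts and how many records have no URL listed.
function compareCounts(jsonText, listText) {
  const records = countRecords(jsonText);
  const urls = countUrls(listText);
  return { records: records, urls: urls, missing: records - urls };
}
```

It could be run on the output directory by reading eupmc_results.json and fulltext_html_urls.txt with `fs.readFileSync(path, 'utf8')` and passing both strings to `compareCounts`.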