petermr / pygetpapers

a Python version of getpapers
Apache License 2.0

pygetpapers fails to download PDF (Err 502) #20

Closed petermr closed 3 years ago

petermr commented 3 years ago

Describe the bug

pygetpapers -p hits a 502 error, and each PDF it saves contains the error message instead of the paper (see #19).

To Reproduce

(base) pm286macbook:gwas pm286$  pygetpapers -q "aardvark" -k 10 -p -o pyaardvark
INFO: Final query is aardvark
INFO: Total Hits are 411
0it [00:00, ?it/s]WARNING: Author list not found for paper 4
WARNING: Abstract not found for paper 5
WARNING: Keywords not found for paper 8
WARNING: Keywords not found for paper 10
1it [00:00, 333.99it/s]
100%|█████████████████████████████████████████████████████████████████| 10/10 [00:08<00:00,  1.19it/s]
(base) pm286macbook:gwas pm286$ ls -lt pyaardvark/*/*.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC6920383/fulltext.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC7165416/fulltext.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC7021774/fulltext.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC7952090/fulltext.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC8242802/fulltext.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC7645055/fulltext.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC8051583/fulltext.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC7460251/fulltext.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC7831365/fulltext.pdf
-rw-r--r--  1 pm286  staff  507 16 Jul 15:06 pyaardvark/PMC7358442/fulltext.pdf
(base) pm286macbook:gwas pm286$  pygetpapers -v
INFO: 0.0.6.3
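
A quick way to confirm which of these downloads are not real PDFs is to look at the leading bytes, since a genuine PDF normally starts with `%PDF-`. The following is a minimal sketch and not part of pygetpapers: the function name `find_bogus_pdfs` is hypothetical, and the `<output>/<PMCID>/fulltext.pdf` layout is assumed from the listing above.

```python
from pathlib import Path

# Signature that normally opens a genuine PDF file.
PDF_MAGIC = b"%PDF-"


def find_bogus_pdfs(output_dir):
    """Return paths of */fulltext.pdf files that do not start with the PDF signature."""
    bogus = []
    for pdf_path in sorted(Path(output_dir).glob("*/fulltext.pdf")):
        # Read only the first few bytes; the whole file is not needed.
        with open(pdf_path, "rb") as handle:
            header = handle.read(len(PDF_MAGIC))
        if header != PDF_MAGIC:
            bogus.append(pdf_path)
    return bogus


if __name__ == "__main__":
    # "pyaardvark" is the output directory used in the command above.
    for path in find_bogus_pdfs("pyaardvark"):
        print(f"{path}: not a PDF ({path.stat().st_size} bytes)")
```

Run against the output directory above, this should list every 507-byte file if they all hold the error message.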

Each PDF file is only 507 bytes and contains the error message rather than the paper.

Expected behavior

The PDFs should download in full, as they do with getpapers, which works correctly:


(base) pm286macbook:gwas pm286$ ls -lt aardvark/*/*.pdf
-rw-r--r--  1 pm286  staff  4246177 16 Jul 15:03 aardvark/PMC8242802/fulltext.pdf
-rw-r--r--  1 pm286  staff  2023463 16 Jul 15:03 aardvark/PMC7021774/fulltext.pdf
-rw-r--r--  1 pm286  staff   967358 16 Jul 15:03 aardvark/PMC7952090/fulltext.pdf
-rw-r--r--  1 pm286  staff  1248439 16 Jul 15:03 aardvark/PMC7460251/fulltext.pdf
-rw-r--r--  1 pm286  staff  1934416 16 Jul 15:03 aardvark/PMC8211746/fulltext.pdf
-rw-r--r--  1 pm286  staff   680106 16 Jul 15:03 aardvark/PMC7165416/fulltext.pdf
-rw-r--r--  1 pm286  staff   122414 16 Jul 15:03 aardvark/PMC7645055/fulltext.pdf
-rw-r--r--  1 pm286  staff  4345595 16 Jul 15:03 aardvark/PMC7358442/fulltext.pdf
-rw-r--r--  1 pm286  staff  1226715 16 Jul 15:03 aardvark/PMC7831365/fulltext.pdf
-rw-r--r--  1 pm286  staff   772353 16 Jul 15:03 aardvark/PMC8051583/fulltext.pdf

Note

This was first reported in #19. That issue now covers detecting any error message and relaying it to the user directly where possible; this issue (#20) is about downloading PDFs.
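
As an illustration of the kind of check that would stop an error page being written out as fulltext.pdf, here is a minimal sketch. It is an assumption about one possible approach, not pygetpapers' actual code: `save_pdf` and `DownloadError` are hypothetical names, and the function takes the status code and body directly so it does not depend on any particular HTTP library.

```python
from pathlib import Path

# Signature that normally opens a genuine PDF file.
PDF_MAGIC = b"%PDF-"


class DownloadError(Exception):
    """Raised when a response cannot be saved as a PDF (hypothetical helper)."""


def save_pdf(status_code, content, destination):
    """Write content to destination only if it looks like a successful PDF download."""
    destination = Path(destination)
    # A 502 (or any other non-200 status) means the body is an error page.
    if status_code != 200:
        raise DownloadError(f"HTTP {status_code} while fetching {destination}")
    # Even a 200 response may carry HTML, so check the signature as well.
    if not content.startswith(PDF_MAGIC):
        raise DownloadError(f"response for {destination} is not a PDF")
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_bytes(content)
    return destination
```

With a check like this, the caller can catch the exception and report the error to the user, and no file is created for the failed download.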

Since getpapers works, this must be a pygetpapers problem.