mitsuhiko / flatex-pdf-download

some code to toy with that can download flatex.at pdfs
Apache License 2.0
43 stars, 8 forks

PDFs damaged when downloading with script #5

Closed by kaulex99 1 year ago

kaulex99 commented 1 year ago

When I download the pdf documents with the script, all of them are damaged and can't be opened. Is there anyone else facing this issue or is it only me?

mitsuhiko commented 1 year ago

Please provide more information; in this form, this cannot be debugged.

kaulex99 commented 1 year ago

@mitsuhiko Sorry, here is more information:

  1. MacOS Ventura 13.2 (22D49)
  2. Python 3.11.1
  3. YES
  4. Current main branch (checked out yesterday)

mitsuhiko commented 1 year ago

@kaulex99 would you mind mailing me one of the corrupted PDFs?

kaulex99 commented 1 year ago

@mitsuhiko No, I will send you a general one via email.

mitsuhiko commented 1 year ago

I looked at the file and it's not a PDF but an HTML error page ("Blocked"). It's because flatex blocked the script again. Needs investigation.

kaulex99 commented 1 year ago

Maybe some error handling in the script would be great, so that it doesn't download all the PDFs if they are broken. Right now the script output looks as if nothing went wrong.
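A rough sketch of what such a check could look like, assuming the script has the downloaded bytes in hand before writing them to disk. The names `looks_like_pdf`, `save_pdf`, and `NotAPdfError` are made up for illustration and are not taken from the script. The check itself is only a heuristic: a PDF normally starts with the `%PDF-` header, while the "Blocked" page is HTML and does not.

```python
from pathlib import Path

# Every well-formed PDF starts with this header, followed by a version number.
PDF_MAGIC = b"%PDF-"


class NotAPdfError(Exception):
    """Raised when a downloaded payload does not look like a PDF."""


def looks_like_pdf(data: bytes) -> bool:
    # Heuristic: only the leading bytes are inspected, the file is not parsed.
    return data.startswith(PDF_MAGIC)


def save_pdf(data: bytes, target: Path) -> None:
    # Hypothetical helper: refuse to write anything that is not a PDF,
    # so an HTML error page never ends up on disk with a .pdf extension.
    if not looks_like_pdf(data):
        preview = data[:80].decode("utf-8", errors="replace")
        raise NotAPdfError(
            f"refusing to write {target.name}: payload starts with {preview!r}"
        )
    target.write_bytes(data)
```

With something like this, the first blocked download would raise and stop the run instead of silently producing a folder of broken files.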

gaambo commented 1 year ago

FYI: having the same problem. I'm using Ubuntu 22 (in WSL 2) and Python 3.10.6 with the current main branch. I tried using the SESSIONID param as well as the user+password combination.

mitsuhiko commented 1 year ago

Happy to accept PRs. Sadly I don't have much time to deal with this myself.

floriangaller commented 1 year ago

maybe #8 solves this issue

gaambo commented 1 year ago

> maybe #8 solves this issue

I just checked out the PR locally and it worked for me (using session-id and flatex.at). Thank you 👍

kaulex99 commented 1 year ago

@mitsuhiko can you please merge :D

mitsuhiko commented 1 year ago

Merged. Thank you.