lmbringas / packtpub-downloader

Script to download all your books from PacktPub inspired by https://github.com/ozzieperez/packtpub-library-downloader

Invalid URL #20

Open SirMikeDouglas opened 5 years ago

SirMikeDouglas commented 5 years ago

The script works for the most part but fails randomly. It downloads many PDFs and ZIPs, then suddenly fails with the error below. When I check my media directory I find a 0-byte file; I delete it, rerun the script with the exact same parameters, and it downloads just fine... until it randomly fails again with the same error. If it matters, I am using Python 3.7, and I see you are working on 3.6.

  File "C:\Python\Python37\lib\site-packages\requests\models.py", line 387, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '': No schema supplied. Perhaps you meant http://?
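
The manual workaround described above (delete the empty file, then rerun) can be scripted. This is only a minimal sketch, assuming all downloads land under a single directory; the function name `remove_empty_files` and its `directory` argument are hypothetical and not part of packtpub-downloader:

```python
import os


def remove_empty_files(directory):
    """Delete zero-byte files under `directory` and return their paths.

    Hypothetical helper: it automates the "delete the 0-byte file" step
    so the next run of the downloader can fetch those books again.
    """
    removed = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # A failed download leaves a file with no content behind.
            if os.path.getsize(path) == 0:
                os.remove(path)
                removed.append(path)
    return removed
```

Running it against the media directory before rerunning the script should clear out the leftovers from failed downloads without touching completed files.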

amal-khailtash commented 5 years ago

Same here:

ERROR (please copy and paste in the issue)
{'message': 'Internal server error'}
502
Starting to download /packtpub/books/xyz.epub
Traceback (most recent call last):
  File "packtpub-downloader/main-new.py", line 229, in <module>
    main(sys.argv[1:])
  File "packtpub-downloader/main-new.py", line 221, in main
    download_book(filename, url)
  File "packtpub-downloader/main-new.py", line 104, in download_book
    r = requests.get(url, stream=True)
  File "lib/python3.6/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "lib/python3.6/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "lib/python3.6/site-packages/requests/sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "lib/python3.6/site-packages/requests/sessions.py", line 462, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "lib/python3.6/site-packages/requests/models.py", line 313, in prepare
    self.prepare_url(url, params)
  File "lib/python3.6/site-packages/requests/models.py", line 387, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '': No schema supplied. Perhaps you meant http://?

BurnhamG commented 5 years ago

Hi there, are both of you using the most recent version from this repo? I'll take a look at the version you're using and let you know if I can figure out why we're seeing this bug.

amal-khailtash commented 5 years ago

I am on the latest version of the repo.

SirMikeDouglas commented 5 years ago

Latest repo too. I wrote that the same day I downloaded and installed it. BTW, kudos again, cool stuff. It's like grandma's cookies; still damn good even with a bug or two. :)

BurnhamG commented 5 years ago

Thank you both. I've added a check for server errors, so the script shouldn't even try to download a book (and leave you with an empty file) when the server returns one. Below are links to a version that is this repo's master branch with slight changes to address these problems, as well as a version that includes parallel downloading (from PR #18).

Current master with patch: https://github.com/BurnhamG/packtpub-downloader/tree/error_fix

Patch with parallel downloads: https://github.com/BurnhamG/packtpub-downloader/tree/async

If you find any bugs or run into any problems just let me know - I haven't had the chance to write tests for the updated code, so my "testing" has just been running the script a few times.
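
For anyone curious what a check like this can look like, here is a rough sketch; it is not the actual patch from the branches above. It assumes the API returns a JSON body with the download link under a `"data"` key (an assumed field name), and `extract_download_url` and the `http_get` parameter are hypothetical names introduced for illustration. The idea is to reject error responses and empty URLs before `requests.get` is ever called, and to open the output file only after the response has succeeded:

```python
from urllib.parse import urlparse


def extract_download_url(status_code, payload):
    """Return the download URL from an API response, or None if unusable.

    `payload` is the decoded JSON body. The "data" key is an assumption,
    not something confirmed in this thread.
    """
    if status_code != 200:
        # e.g. 502 with {'message': 'Internal server error'}
        return None
    url = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(url, str):
        return None
    parsed = urlparse(url)
    # An empty string has no scheme, which is what triggers MissingSchema.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None
    return url


def download_book(filename, url, http_get=None):
    """Stream `url` to `filename`; skip the download when `url` is None."""
    if url is None:
        print(f"Skipping {filename}: no download URL (server error?)")
        return False
    if http_get is None:
        import requests  # lazy import so the URL check has no dependencies
        http_get = requests.get
    response = http_get(url, stream=True)
    response.raise_for_status()
    # The file is created only after the request succeeded, so a failed
    # request cannot leave a zero-byte file behind.
    with open(filename, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    return True
```

With this shape, a 502 response yields `None` from `extract_download_url`, and `download_book` skips that book instead of raising `MissingSchema`.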

amal-khailtash commented 5 years ago

I reran the script today and I do not see this error anymore!