alexgand / springer_free_books

Python script to download all Springer books released for free during the 2020 COVID-19 quarantine
GNU General Public License v3.0

Fixed the HTTP Content-Length issue #102

Closed: chaosAD closed this 4 years ago

chaosAD commented 4 years ago

The Content-Length header is not necessarily present in an HTTP response [1], so the server may send valid book content without it. The code should therefore not depend on it. I tweaked wildmichael's code to display the progress bar for an individual book download only if Content-Length is present; otherwise it skips the progress bar but still downloads the book.
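As a rough illustration of that behaviour (not the code from this PR), here is a minimal sketch that uses only the standard library. The name `save_stream` and its parameters are hypothetical. With `requests`, `headers` would presumably come from `response.headers` and `chunks` from `response.iter_content(...)`.

```python
import sys


def save_stream(chunks, headers, out, report=sys.stderr):
    """Write every chunk to `out`; report progress only when the
    Content-Length header is present and numeric.

    Hypothetical helper for illustration, not taken from the PR.
    `headers` is treated as a plain dict here, so the lookup is
    case-sensitive in this sketch.
    """
    raw = headers.get("Content-Length")
    total = int(raw) if raw is not None and raw.isdigit() else None

    written = 0
    for chunk in chunks:
        out.write(chunk)
        written += len(chunk)
        if total:  # skip the progress display if the size is unknown (or zero)
            percent = min(100, written * 100 // total)
            report.write(f"\r{percent:3d}%")
    if total:
        report.write("\n")
    return written
```

The download itself never branches on the header; only the progress reporting does, so a response without Content-Length is still saved in full.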

The code also guesses whether the link points to a book or to an HTML page (not the best solution, though). If it is an HTML page, it raises an error and skips to the next title. This fixes #98.
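One possible heuristic for that guess, sketched under assumptions: the function name `looks_like_book` is hypothetical, and the check in this PR may work differently. It combines the Content-Type header with the first bytes of the body (`%PDF-` for PDF files, the ZIP signature `PK\x03\x04` for EPUB files).

```python
def looks_like_book(content_type, first_bytes):
    """Return True if the response is probably a PDF or EPUB, not an HTML page.

    Hypothetical heuristic for illustration; not the check used in the PR.
    """
    # Drop parameters such as "; charset=UTF-8" before comparing.
    ctype = (content_type or "").split(";")[0].strip().lower()
    if ctype == "text/html":
        return False

    # PDF files start with "%PDF-"; EPUB files are ZIP archives ("PK\x03\x04").
    if first_bytes.startswith((b"%PDF-", b"PK\x03\x04")):
        return True

    # An HTML body served with a missing or misleading Content-Type.
    head = first_bytes.lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return False

    return ctype in ("application/pdf", "application/epub+zip")
```

A caller could raise an error when this returns False and move on to the next title, which matches the behaviour described above.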

Reference [1] https://www.w3.org/Protocols/HTTP/1.0/draft-ietf-http-spec.html#Content-Length

alexgand commented 4 years ago

Hi @chaosAD, I merged this PR, but it breaks the code for me:

image

chaosAD commented 4 years ago

Apparently, Springer updated the Excel file with an empty entry (at index 293), which breaks the code. It breaks not only the latest commit in this PR but all older commits as well. I have fixed it in my PR #104.
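For illustration only, one way to guard against such an empty entry is to skip rows whose required fields are blank. This is a sketch and not the fix from PR #104. The names `is_blank` and `valid_rows` and the keys `"title"` and `"url"` are hypothetical, and rows are assumed to be plain dicts (empty spreadsheet cells typically arrive as `None`, NaN, or an empty string, depending on the reader).

```python
import math


def is_blank(value):
    """True for None, NaN, or an empty/whitespace-only string."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and not value.strip()


def valid_rows(rows, required=("title", "url")):
    """Yield (index, row) for rows whose required fields are all filled in.

    `required` holds hypothetical column names, not the real spreadsheet headers.
    """
    for index, row in enumerate(rows):
        if any(is_blank(row.get(key)) for key in required):
            continue  # skip the empty entry instead of crashing on it
        yield index, row
```

Keeping the original index in the output makes it easy to log which spreadsheet entries were skipped.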