alexgand / springer_free_books

Python script to download all Springer books released for free during the 2020 COVID-19 quarantine
GNU General Public License v3.0

Link to the downloaded books #119

Open kangli-bionic opened 4 years ago

kangli-bionic commented 4 years ago

The script is not easy to run to completion when downloading all the books. I was finally able to finish downloading on my fifth attempt, so to save people some time I'm sharing a link to the downloaded books.

Currently missing the last 13 titles in the Excel file.

https://drive.google.com/drive/folders/1JC15m__PbPaowQ7k2zS1-Us72yvROCQs?usp=sharing

valahna commented 4 years ago

For those who want the recently released 1000 books for summer and to learn how I approached downloading any set that is freely available:

I grabbed the CSV report from the search page, which contains the links and some metadata. I parsed the URLs and DOIs into direct download links, and then created two text files: one for the PDF versions and one for the EPUB versions. I then imported these files into DownThemAll (a download manager extension), which proceeded to download all one thousand books. Not all of them have an EPUB version, so some downloads will fail in that regard. I kept the simultaneous download limit to 8-10 at a time, and it worked fine. Whether that is because the extension acts as a complete browser and handles the cookies, headers, and so on for you, or because limiting the number of simultaneous downloads kept it from being flagged as a bot/script, I don't know. Further testing and data would be needed to determine this.
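For anyone who wants to reproduce the link-building step, here is a rough sketch of how it could look in Python. This is not the exact script I used. The column name `Item DOI` and the two URL templates are assumptions about the CSV export and Springer's link layout, so check them against your own file before relying on it. The function names are made up for illustration.

```python
import csv
import io

# Assumed URL patterns for direct downloads; verify against a real book page.
PDF_TEMPLATE = "https://link.springer.com/content/pdf/{doi}.pdf"
EPUB_TEMPLATE = "https://link.springer.com/download/epub/{doi}.epub"


def build_links(csv_text, doi_column="Item DOI"):
    """Return (pdf_links, epub_links) built from the DOI column of the CSV text.

    Rows with an empty or missing DOI are skipped.
    """
    pdf_links, epub_links = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doi = (row.get(doi_column) or "").strip()
        if not doi:
            continue
        pdf_links.append(PDF_TEMPLATE.format(doi=doi))
        epub_links.append(EPUB_TEMPLATE.format(doi=doi))
    return pdf_links, epub_links


def write_list(path, links):
    """Write one URL per line, the format download managers usually import."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(links) + "\n")
```

Reading the CSV report into a string, calling `build_links` on it, and then calling `write_list` once for each list gives the two text files to import into the download manager. An EPUB link is generated for every book, so the ones without an EPUB version will simply fail to download.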

Then I wrote a script to parse the CSV and update the PDFs with the metadata using exiftool, and renamed the files to something other than the DOI. I compressed them into three archives, one with the PDFs and two with the EPUBs. You can find the CSV I used and the archives with all the books here: Mega Hosted
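The metadata and renaming step could be sketched roughly like this. Again, this is an illustration rather than my actual script. It assumes `exiftool` is installed and on the PATH, that only the title and author fields are written, and that the title is used as the new file name. The helper names are hypothetical.

```python
import re
import subprocess
from pathlib import Path


def safe_filename(title, max_len=120):
    """Turn a book title into a file name that is safe on common file systems."""
    cleaned = re.sub(r'[\\/:*?"<>|]+', " ", title)  # drop reserved characters
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse runs of whitespace
    return cleaned[:max_len] or "untitled"


def exiftool_command(pdf_path, title, authors):
    """Build the exiftool call that writes Title and Author into the PDF."""
    return [
        "exiftool",
        "-overwrite_original",  # do not leave a *_original backup copy behind
        f"-Title={title}",
        f"-Author={authors}",
        str(pdf_path),
    ]


def tag_and_rename(pdf_path, title, authors):
    """Write the metadata with exiftool, then rename the file after its title."""
    pdf_path = Path(pdf_path)
    subprocess.run(exiftool_command(pdf_path, title, authors), check=True)
    target = pdf_path.with_name(safe_filename(title) + ".pdf")
    pdf_path.rename(target)
    return target
```

Calling `tag_and_rename` once per CSV row, with the downloaded file's path plus that row's title and authors, covers the whole set. Two books with the same title would collide under this naming, so a real run may want to append the year or the ISBN.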

To chaosAD's point, mine is certainly more of a "cat and mouse" approach, and not as elegant and refined as a script that handles all of this for you; however, I think it is a little impractical for someone to visit each book's Springer page and click two download buttons for all one thousand of these titles.

CyclopeanBee commented 4 years ago

I have 11 of the missing books! The final two didn't have download links any more when I checked.

AntoineSoetewey commented 3 years ago

Hello @kangli-bionic,

Can you confirm that your Google Drive link and the books will remain accessible for as long as possible?

I would like to include your link at the top of this article, so I would like to make sure books are not removed soon.

Thanks again for having downloaded the books!

Regards, Antoine