MiniGlome / Archive.org-Downloader

Python3 script to download archive.org books in PDF format
864 stars, 116 forks

IndexError: list index out of range #117

Open nf24eg opened 4 months ago

nf24eg commented 4 months ago

I'm getting this error now

1 Book(s) to download
[+] Successful login

Current book: https://archive.org/details/biblestoriesfrom0000amer
[+] Successful loan
Traceback (most recent call last):
  File "C:\Users\NAS\Archive.org-Downloader\archive-org-downloader.py", line 213, in <module>
    title, links, metadata = get_book_infos(session, url)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\NAS\Archive.org-Downloader\archive-org-downloader.py", line 20, in get_book_infos
    infos_url = "https:" + r.split('bookManifestUrl="')[1].split('"\n')[0]
IndexError: list index out of range

mikerusuk commented 3 months ago

I am getting the same error

abdelkhalak commented 2 months ago

same here

L0que commented 1 month ago

Me too FWIW.

karasmith441 commented 1 month ago

It seems that bookManifestUrl is no longer returned by the GET request, but the book manifest does still appear in an HTML element. I do not know how robust this solution is, but it has worked for me on multiple books.
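For anyone wondering why the missing marker shows up as an IndexError: when the separator is absent, str.split returns a one-element list, so index [1] does not exist. A minimal sketch, using a made-up page string (page_without_marker is hypothetical, not real archive.org output):

```python
# Hypothetical page body that lacks the old marker (not real archive.org output).
page_without_marker = '<html><body><input class="js-bookreader"></body></html>'

# With no separator found, split() returns the whole string as a single element.
parts = page_without_marker.split('bookManifestUrl="')
print(len(parts))

try:
    parts[1]  # the same indexing that line 20 of the script performs
    raised = False
except IndexError:
    raised = True
print(raised)
```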

In archive-org-downloader.py, add the following imports at the top of the script:

import json
from bs4 import BeautifulSoup

(json is part of the Python standard library, so only bs4 may need installing; it is published on PyPI as beautifulsoup4. If it is missing, add it to requirements.txt and rerun the pip command in the readme.) Then replace the line that raises the IndexError with:

infos_url = "https:" + json.loads(BeautifulSoup(r, 'html.parser').find("input", {"class": "js-bookreader"})['value'])['url']
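If you would rather not add a dependency, the same lookup can probably be done with the standard library's html.parser in place of BeautifulSoup. This is only a sketch under assumptions: the helper name extract_manifest_url, the class _BookreaderInputFinder and the sample HTML are made up for illustration, and it assumes the input's value attribute holds JSON with a protocol-relative "url" field, as the one-liner above implies. It raises a descriptive error instead of a bare IndexError if the element is missing.

```python
import json
from html.parser import HTMLParser


class _BookreaderInputFinder(HTMLParser):
    """Records the value attribute of the first <input> whose class list has js-bookreader."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.value is not None:
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if "js-bookreader" in classes:
            self.value = attr_map.get("value")


def extract_manifest_url(page_html):
    """Return the manifest URL found in page_html (hypothetical helper, not part of the script)."""
    finder = _BookreaderInputFinder()
    finder.feed(page_html)
    if finder.value is None:
        raise ValueError("no input.js-bookreader element found; the page layout may have changed")
    # Assumption: the JSON "url" field is protocol-relative, hence the "https:" prefix.
    return "https:" + json.loads(finder.value)["url"]


# Made-up sample markup, not copied from archive.org.
sample = '<input class="js-bookreader" type="hidden" value=\'{"url": "//example.org/manifest.json"}\'>'
print(extract_manifest_url(sample))
```

In the script this would be called as infos_url = extract_manifest_url(r), where r is the page text already fetched in get_book_infos.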