MiniGlome / Archive.org-Downloader

Python3 script to download archive.org books in PDF format

Script crashes when it's time to img2pdf. Working for anyone else? #73

Open Tenome opened 1 year ago

Tenome commented 1 year ago

I have all the requirements installed, and I've confirmed that img2pdf is installed and functioning as normal. I'm on Python 3.8.5. Any idea why it might break? It doesn't give an error message, just an appcrash popup.

darnn commented 1 year ago

The developer might give you a better answer, but I would just comment out the img2pdf part (that's what I did), and download the actual images and then use something else to make a PDF out of them. You're better off processing them with ScanTailor Advanced first anyway: https://github.com/vigri/scantailor-advanced

Tenome commented 1 year ago

I'm trying to archive a lot of books though, any idea how I could implement another program into a batch?

darnn commented 1 year ago

I don't have one, sadly. But I think you might find better places to ask, if it's just about a program that will collate a bunch of images into a PDF. If you're also interested in OCR, you're really better off processing them first. Keep in mind also that if you want to retain the quality, you're either going to have to binarize them or have PDFs that are going to be hundreds of megabytes.
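
For the batch question above, one possible shape is sketched below. It assumes each book's pages sit in their own subfolder of a download directory and that page files are numbered; both are assumptions, not something confirmed in this thread. The names `plan_batch`, `natural_key` and `images_to_pdf` are hypothetical helpers, and the Pillow-based converter is just one stand-in for "something else to make a PDF".

```python
import re
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def natural_key(path):
    # Split digit runs from text so that "10.jpg" sorts after "2.jpg".
    parts = re.split(r"(\d+)", path.name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


def plan_batch(root):
    """Map each book folder under root to (sorted page images, output PDF path)."""
    plan = {}
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        images = sorted(
            (f for f in folder.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES),
            key=natural_key,
        )
        if images:  # skip folders that contain no page images
            plan[folder.name] = (images, folder.parent / (folder.name + ".pdf"))
    return plan


def images_to_pdf(images, out_path):
    # Assumption: Pillow is installed (pip install Pillow).
    # All pages are held in memory at once, which may be heavy for long books.
    from PIL import Image

    pages = [Image.open(p).convert("RGB") for p in images]
    pages[0].save(out_path, "PDF", save_all=True, append_images=pages[1:])


def run_batch(root, convert=images_to_pdf):
    # Convert every planned folder; `convert` can be swapped for another tool.
    for images, out_path in plan_batch(root).values():
        convert(images, out_path)
```

Keeping the folder scan separate from the converter means a different tool can be plugged in by passing another `convert` callable. Note that Pillow re-encodes the pages when writing the PDF, so this does not address the quality and file-size trade-off mentioned above.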

MiniGlome commented 1 year ago

Have you found a solution to this issue? It seems to be very user-specific as no one else reported this.

darnn commented 1 year ago

Not me...

asheroto commented 11 months ago

Yep got the same issue...

[+] Successful loan
[+] Found 111 pages
Downloading pages...
100%|████████████████████████████████████████████████████████████████████████████████| 111/111 [00:24<00:00, 15.72it/s]
Traceback (most recent call last):
  File "archive-org-downloader.py", line 260, in <module>
    pdf = img2pdf.convert(images, **pdfmeta)
  File "C:\Python37\lib\site-packages\img2pdf.py", line 2658, in convert
    kwargs["pdfa"],
  File "C:\Python37\lib\site-packages\img2pdf.py", line 742, in __init__
    v = ("D:" + datetime_to_pdfdate(v)).encode("ascii")
  File "C:\Python37\lib\site-packages\img2pdf.py", line 725, in datetime_to_pdfdate
    return dt.astimezone(tz=timezone.utc).strftime("%Y%m%d%H%M%SZ")
OSError: [Errno 22] Invalid argument

You can however open the Downloads folder and compile the images into a PDF manually.
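
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the last frame of the traceback calls `astimezone` on a datetime taken from the PDF metadata. If that datetime is naive (no timezone attached), Python has to ask the operating system for the local UTC offset, and on Windows that lookup can fail with `OSError: [Errno 22]` for dates the platform cannot represent, such as years before 1970. A timezone-aware datetime is converted with plain arithmetic instead. The sketch below uses a hypothetical helper, `make_aware`, and repeats the conversion from the failing frame with the standard library only.

```python
from datetime import datetime, timezone


def make_aware(dt):
    """Return dt unchanged if it carries a UTC offset, otherwise tag it as UTC."""
    if dt.tzinfo is None or dt.tzinfo.utcoffset(dt) is None:
        # Assumption: treating a naive metadata date as UTC is acceptable here.
        return dt.replace(tzinfo=timezone.utc)
    return dt


def to_pdfdate(dt):
    # Same conversion as the failing line in the traceback, but applied to an
    # aware datetime so no OS local-time lookup is needed.
    return make_aware(dt).astimezone(tz=timezone.utc).strftime("%Y%m%d%H%M%SZ")


if __name__ == "__main__":
    print(to_pdfdate(datetime(1901, 1, 1)))  # 19010101000000Z
```

If `pdfmeta` in the script holds a naive datetime (for example under img2pdf's `creationdate` keyword, which is an assumption about how the script builds it), passing it through `make_aware` before calling `img2pdf.convert(images, **pdfmeta)` might avoid the crash. Removing the date entry from `pdfmeta` would be another thing to try.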

nf24eg commented 11 months ago

Great code for downloading. I faced some issues, but the developer helped me out, so I uninstalled everything and reinstalled it, and it was working again. Sometimes, though, I still get the same issue when it's time for img2pdf to run. Here is the error I got when downloading this book as an example: https://archive.org/details/lifeinpalestinew0000carp

Current book: https://archive.org/details/lifeinpalestinew0000carp
[+] Successful loan
[+] Found 210 pages
Downloading pages...
100%|████████████████████████████████████████████████████████████████████████████████| 210/210 [00:40<00:00, 5.15it/s]
Traceback (most recent call last):
  File "C:\Users\XXX\Archive.org-Downloader\archive-org-downloader.py", line 260, in <module>
    pdf = img2pdf.convert(images, **pdfmeta)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Local\Programs\Python\Python312\Lib\site-packages\img2pdf.py", line 2639, in convert
    pdf = pdfdoc(
          ^^^^^^^
  File "C:\Users\XXX\AppData\Local\Programs\Python\Python312\Lib\site-packages\img2pdf.py", line 742, in __init__
    v = ("D:" + datetime_to_pdfdate(v)).encode("ascii")
                ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Local\Programs\Python\Python312\Lib\site-packages\img2pdf.py", line 725, in datetime_to_pdfdate
    return dt.astimezone(tz=timezone.utc).strftime("%Y%m%d%H%M%SZ")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 22] Invalid argument