Leseratte10 / acsm-calibre-plugin

Calibre plugin for ACSM->EPUB and ACSM->PDF conversion.
https://www.mobileread.com/forums/showthread.php?t=341975
GNU General Public License v3.0

An unexpected error occurred for XXX.acsm: unsupported operand type(s) for +: 'int' and 'str' #89

Open nozonyan opened 1 month ago

nozonyan commented 1 month ago

Bug description

Some PDFs from the French publisher L'Harmattan fail during processing with the error unsupported operand type(s) for +: 'int' and 'str'.

Operating system

Linux, Windows

Which version of Calibre are you running?

7.10

Which version of the ACSM Input plugin are you running?

v0.0.16

Import type

Clicking the 'Add books' button in the menu bar, Dragging-and-Dropping the ACSM file into the Calibre window, Using an auto-add folder (Preferences -> Adding books -> Automatic adding)

Further information

I actually run the plugin standalone, but the files also fail to download through calibre. They work if I load them in ADE, but there is no ADE for Linux (hence my use of deacsm).

That's a PDF file
Successfully downloaded PDF, patching encryption ...
Searching for startxref ...
Got startxref: 2320493
Found ENC after 10 attempts - took 0 ms
Odd formatting of encryption blob?
If this doesn't work correctly please open a bug report.
Found EBX after 13 attempts - took 0 ms

Encryption handler:
<</Length     948/Type/XRef/Root 255 0 R/Info null/Encrypt 721 0 R/ID[<573990748AEDAF82D140CA3E9381F68E><DD46A38000625928F393A35C2F034853>]/Size 723/Index[0 723]/W[1 3 1]/DecodeParms<</Columns 5/Predictor 12>>/Filter/FlateDecode>>stream
EBX handler:
<</EBX_PUBLISHER(L'Harmattan Edition Diffusion)/Filter/EBX_HANDLER/Length 128/ADEPT_ID(urn:uuid:6da67e3f-3e09-467f-9b9c-365656d16868)/V 4/EBX_TITLE(Les Hyperborens)/EBX_AUTHOR(Grard Lambin;)>>
Trimmed encryption handler:
<</Length     948/Type/XRef/Root 255 0 R/Info null/Encrypt 721 0 R/ID[<573990748AEDAF82D140CA3E9381F68E><DD46A38000625928F393A35C2F034853>]/Size 723/Index[0 723]/W[1 3 1]/DecodeParms<</Columns 5/Predictor 12>>/Filter/FlateDecode>>
Updated EBX handler not logged due to sensitive data
An unexpected error occurred for AN 3776305.pdf.acsm: unsupported operand type(s) for +: 'int' and 'str'
That didn't work!

Leseratte10 commented 1 month ago

The PDF format has quite a few oddities. This issue has come up a couple of times before, when the PDFs in question had unusual formatting in their encryption data. The cause is that my plugin doesn't use a full PDF parsing library; it basically just hex-edits the needed changes into the file.

To debug (and potentially fix) this, I would need access to the raw, unmodified PDF file. The log you posted seems to be incomplete: a couple of lines above the ones you posted, there should be something like "Loading book from xxxx" (when using the plugin) or just a plain URL to a PDF file (when running it standalone). Can you post that URL here? It doesn't contain any personal information (it's the same URL for everyone downloading that particular book).
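
For readers following the log, the "Searching for startxref" step refers to a standard PDF feature: the file ends with the keyword startxref, followed by the byte offset of the cross-reference data, followed by %%EOF. The following is a minimal sketch of how such an offset can be read without a full PDF parser. It is not the plugin's actual code; the function name find_startxref, the tail_size parameter, and the sample bytes are assumptions made for illustration.

```python
import re


def find_startxref(pdf_bytes: bytes, tail_size: int = 1024) -> int:
    """Return the byte offset that follows the last 'startxref' keyword.

    Hypothetical helper for illustration only, not taken from the plugin.
    """
    # Only the end of the file is searched, since that is where the keyword lives.
    tail = pdf_bytes[-tail_size:]
    # Use the last match: incrementally updated PDFs can contain several.
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        raise ValueError("no startxref found in the last %d bytes" % tail_size)
    return int(matches[-1])


# Tiny synthetic input (not a valid PDF, just the shape of the file ending).
sample = b"%PDF-1.5\n...objects...\nstartxref\n2320493\n%%EOF\n"
print(find_startxref(sample))  # 2320493
```

The offset is converted to int right where it is extracted, so later arithmetic on it never mixes numbers and text.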

nozonyan commented 1 month ago

This should be the full log:

Fulfilling book 'AN 3776305.pdf.acsm' ...
Not notifying any server since that was disabled.
Downloading book 'AN 3776305.pdf.acsm' ...
http://rps2images.ebscohost.com/rpsweb/download_artifact/NL$3776305$PDF/s3381253:36430808:3776305:1716070336529:1882325036/4d9e/b6fa7b07/5ec8279f_download.pdf
Download took 1453 milliseconds
That's a PDF file
Successfully downloaded PDF, patching encryption ...
Searching for startxref ...
Got startxref: 2320493
Found ENC after 10 attempts - took 0 ms
Odd formatting of encryption blob?
If this doesn't work correctly please open a bug report.
Found EBX after 13 attempts - took 0 ms

Encryption handler:
<</Length     948/Type/XRef/Root 255 0 R/Info null/Encrypt 721 0 R/ID[<573990748AEDAF82D140CA3E9381F68E><DD46A38000625928F393A35C2F034853>]/Size 723/Index[0 723]/W[1 3 1]/DecodeParms<</Columns 5/Predictor 12>>/Filter/FlateDecode>>stream
EBX handler:
<</EBX_PUBLISHER(L'Harmattan Edition Diffusion)/Filter/EBX_HANDLER/Length 128/ADEPT_ID(urn:uuid:6da67e3f-3e09-467f-9b9c-365656d16868)/V 4/EBX_TITLE(Les Hyperborens)/EBX_AUTHOR(Grard Lambin;)>>
Trimmed encryption handler:
<</Length     948/Type/XRef/Root 255 0 R/Info null/Encrypt 721 0 R/ID[<573990748AEDAF82D140CA3E9381F68E><DD46A38000625928F393A35C2F034853>]/Size 723/Index[0 723]/W[1 3 1]/DecodeParms<</Columns 5/Predictor 12>>/Filter/FlateDecode>>
Updated EBX handler not logged due to sensitive data
An unexpected error occurred for AN 3776305.pdf.acsm: unsupported operand type(s) for +: 'int' and 'str'
That didn't work!
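
As background on the error message itself: Python raises this exact TypeError when an int is added to a str, which can easily happen when a number is cut out of a file as text and then used in arithmetic without conversion. The sketch below reproduces the message using the dictionary text from the log above. It is an illustration under stated assumptions, not the plugin's code and not a diagnosis of where the plugin fails; the function name encrypt_object_number and the regular expressions are hypothetical.

```python
import re

# Dictionary text copied from the "Trimmed encryption handler" line in the log.
xref_dict = (
    "<</Length     948/Type/XRef/Root 255 0 R/Info null/Encrypt 721 0 R"
    "/ID[<573990748AEDAF82D140CA3E9381F68E><DD46A38000625928F393A35C2F034853>]"
    "/Size 723/Index[0 723]/W[1 3 1]"
    "/DecodeParms<</Columns 5/Predictor 12>>/Filter/FlateDecode>>"
)


def encrypt_object_number(dictionary: str) -> int:
    """Return the object number of an indirect '/Encrypt N G R' reference.

    Hypothetical helper for illustration only.
    """
    match = re.search(r"/Encrypt\s+(\d+)\s+(\d+)\s+R", dictionary)
    if match is None:
        raise ValueError("no indirect /Encrypt reference found")
    return int(match.group(1))  # convert here; a regex group is always a str


# A regex group is text, so using it directly in arithmetic fails.
raw = re.search(r"/Encrypt\s+(\d+)", xref_dict).group(1)  # '721' as a str
try:
    1 + raw
except TypeError as exc:
    message = str(exc)

print(message)  # unsupported operand type(s) for +: 'int' and 'str'
print(encrypt_object_number(xref_dict) + 1)  # 722
```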

The script then generates a tmp_AN 3776305.pdf file.

I could also send you the ACSM/PDF privately somehow if you use Discord.