denysvitali / squid_decoder

A decoder for the Squid page format

Pages just White #6

Open Cobular opened 4 years ago

Cobular commented 4 years ago

Hey! I just came across this tool and was really hoping it would work, but alas, something seems to have broken in the last 2 years. Some of the PDFs come out entirely white when I open them in Chrome, and every page of their conversion looks like this: [image]. Each page is also only 1 KB in size, which doesn't seem right. I have also attached a sample PDF in case that helps. I did have to change the following code from

https://github.com/denysvitali/squid_decoder/blob/a5992bce40f71b8c22db1efa01ca841c4944f494/papyrus.py#L15-L23

to

# This is a quick fix to check whether we can use pyPdf (deprecated) or PyPDF2
import pip
from PyPDF2 import PdfFileWriter, PdfFileReader

since the call to pip.get_installed_distributions() was raising an error, and from what I read online, pip isn't meant to be called as a library from inside your own code. With that change it does seem to be mostly working, which is strange and a bit worrisome.
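For anyone hitting the same error, one possible way to keep the pyPdf/PyPDF2 fallback without asking pip what is installed is to simply try the imports in order. This is only a sketch, not the code from papyrus.py: `import_first_available` and `load_pdf_classes` are hypothetical helper names, and it assumes a PyPDF2 release older than 3.0, where `PdfFileWriter` and `PdfFileReader` can still be instantiated.

```python
import importlib


def import_first_available(candidates):
    """Return the first module in `candidates` that can be imported.

    Hypothetical helper: it replaces the pip-based lookup with plain
    import attempts, so it does not depend on pip's internal API.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This candidate is not installed; try the next one.
            continue
    raise ImportError(
        "none of these modules are installed: " + ", ".join(candidates)
    )


def load_pdf_classes():
    """Prefer PyPDF2 and fall back to the older pyPdf package.

    Assumption: both packages expose PdfFileWriter and PdfFileReader
    under these names (true for pyPdf and for PyPDF2 before 3.0).
    """
    pdf_lib = import_first_available(["PyPDF2", "pyPdf"])
    return pdf_lib.PdfFileWriter, pdf_lib.PdfFileReader
```

Calling `PdfFileWriter, PdfFileReader = load_pdf_classes()` near the top of the script would then stand in for the original check, and it fails with a clear ImportError if neither package is present.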

Combined PDF: 20180118_-_Ch.14-_History_Notes.pdf

Functional Page: page1.pdf

If you could help me, I would be very appreciative, but if this project is totally dead, then I understand. Either way, thanks!

EDIT: Looked into it more, and that one document has a really weird page size. Could that be it?

denysvitali commented 4 years ago

Hey!
Thanks for opening this issue! Can you please send me the original file so that I can try to reproduce the problem? What version of Squid are you using?

Cobular commented 4 years ago

It seems I totally forgot to follow up on this, sorry! I worked it out in the end, it was my bad. Thanks tho!

denysvitali commented 4 years ago

No problem! Can you elaborate a little bit more on what the issue was so that it could be helpful for other users too? (: