Closed Wikunia closed 9 years ago
Hi Ole, thanks for the feedback. I've noticed this every now and then, but nothing consistent. Do you have a PDF you could share that reproduces this error so I can take a stab at tracking it down?
On Thursday, February 12, 2015, Ole Kröger notifications@github.com wrote:
Hi,
first of all, thanks for this awesome program. Unfortunately I have a problem: when I search for a string inside the _ocr PDF, I find the matches, but the highlighted part is always around 5 cm (I know that's not the best unit :D) below the real match. [image: pypdfocr] https://cloud.githubusercontent.com/assets/4931746/6173083/89040d28-b2e5-11e4-8369-32d76272b46e.png
Hi, I can't share the PDF with you, but I will look for another one :) Stay tuned! These are my logs, and btw the CPU usage doesn't look normal...
Starting conversion of 2015.pdf
WARNING: X-dpi is 16, Y-dpi is 22, defaulting to 300
convert: unable to extent pixel cache `Cannot allocate memory' @ fatal/cache.c/CacheSignalHandler/3333.
WARNING: Could not run command convert "2015_18.jpg" -respect-parenthesis \( -clone 0 -colorspace gray -negate -lat 15x15+5\% -contrast-stretch 0 \) -compose copy_opacity -composite -opaque none +matte -modulate 100,100 -blur 1x1 -adaptive-sharpen 0x2 -negate -define morphology:compose=darken -morphology Thinning Rectangle:1x30+0+0 -negate "2015_preprocess.jpg"
Making pool
Completed conversion successfully to 2015_ocr.pdf
pypdfocr 2015.pdf 1176.63s user 16.66s system 345% cpu 5:45.50 total
Well it looks fine on any other pdf I tried so don't worry. The important thing is that I can now search inside the pdf! Thanks!!!
By any chance, was the "bad" one in landscape and the rest in portrait orientation?
The "bad" one is a PDF generated with LaTeX, I think, and it looks like it has a 1:1 aspect ratio.
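For anyone comparing a "bad" file against working ones, here is a minimal sketch of how a page could be classified as portrait, landscape, or roughly square from its width and height in PDF points. The function name `classify_orientation` and the 5% tolerance are assumptions made for illustration; they are not part of pypdfocr, and the page dimensions would have to be read from the PDF with whatever tool you prefer.

```python
def classify_orientation(width_pt, height_pt, tolerance=0.05):
    """Classify a page as 'portrait', 'landscape', or 'square'.

    width_pt and height_pt are page dimensions in PDF points (1/72 inch).
    tolerance (hypothetical default) is the relative deviation from a
    1:1 ratio under which the page still counts as square.
    """
    if width_pt <= 0 or height_pt <= 0:
        raise ValueError("page dimensions must be positive")
    ratio = width_pt / height_pt
    if abs(ratio - 1.0) <= tolerance:
        return "square"
    return "landscape" if ratio > 1.0 else "portrait"


# Example page sizes in points: A4 portrait, A4 landscape, and a 1:1 page.
for size in [(595, 842), (842, 595), (500, 500)]:
    print(size, classify_orientation(*size))
```

If the problem file comes out as "square" or "landscape" while the working files are all "portrait", that would support the guess that page geometry is involved in the highlight offset, though it would not prove it.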