Closed Bladieblah closed 1 year ago
Very nice!
I've made a simple test that writes the extracted images to PNG files:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os

from PIL import Image
from xpydf.pdf_loader import PdfLoader

filename = os.path.join('test_data', 'xpdf_tests_password.pdf')
loader = PdfLoader(filename, user_password='userpassword')
# pdf_text = loader.extract_strings()
pdf_info = loader.extract_page_info()

for page in pdf_info:
    pdf_images = loader.extract_images(page['page_number'])
    for idx, image in enumerate(pdf_images):
        img = Image.fromarray(image, 'RGB')
        img.save(f'''image_{page['page_number']}_{idx}.png''')
        print(f'''image_{page['page_number']}_{idx}.png written''')
Output:
3 images
3 images
image_1_0.png written
image_1_1.png written
image_1_2.png written
P.S. I think the "3 images" lines in the output are there for testing purposes?
@ReMiOS I added some more tests and support for another file encoding. This should cover all possible image types, but no guarantees, since PDFs are unpredictable and I haven't tested on a lot of data yet :)
If you want, you can try it yourself a bit more before I merge. If I have time, I'll do some larger tests as well.
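Side note on the hard-coded 'RGB' in the test script above: if other image types come back with a different number of channels, the mode could be picked from the array shape instead. A minimal sketch, assuming extract_images returns 8-bit arrays shaped (height, width) or (height, width, channels), which this thread does not confirm. The helper name pil_mode_for_shape is hypothetical and not part of xpydf.

```python
def pil_mode_for_shape(shape):
    """Return the PIL mode name for an 8-bit image array of the given shape.

    Assumption: shape is (height, width) for grayscale, or
    (height, width, channels) with 3 (RGB) or 4 (RGBA) channels.
    """
    if len(shape) == 2:
        return "L"  # single-channel, 8-bit grayscale
    if len(shape) == 3:
        modes = {3: "RGB", 4: "RGBA"}
        if shape[2] in modes:
            return modes[shape[2]]
    # Anything else is rejected rather than guessed at.
    raise ValueError(f"unsupported image array shape: {shape}")
```

In the script, pil_mode_for_shape(image.shape) would then take the place of the literal 'RGB'. Unknown shapes raise an error rather than silently producing a wrong image.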
I'll do some more testing. First impressions look good
Update: I've done more testing with various PDFs and could not find any unexpected results.
Sample PDFs containing images:
https://www.appsloveworld.com/download-sample-pdf-files-for-testing
https://www.learningcontainer.com/sample-pdf-files-for-testing/
Nice! I've run some tests as well and it seems to work well
@ReMiOS images that require drawImageMask need one small fix, but most images should work now.
Closes #12