Stirling-Tools / Stirling-PDF

#1 Locally hosted web application that allows you to perform various operations on PDF files
https://stirlingpdf.com
MIT License

Functionality Request: Multiple Images on one PDF page #1400

Open github-cli opened 5 months ago

github-cli commented 5 months ago

I often scan different cards such as insurance cards, driver's licenses, government IDs, etc. My request is to detect the pictures in a PDF, or take two (or more) pictures, and align them in order on a single PDF page (not just one image per page). Preferably, I would select two images to be combined into a PDF that is cropped to the images but still prints at the original size.

sorydi3 commented 3 months ago

I can work on this. Could you please let me know which existing features should be enhanced with this feature request, if any? @Frooodle

Frooodle commented 3 months ago

So it seems they want to detect images from scans, not extract them from a digital PDF file, meaning it would need some computer vision to detect what is a photo. We do have some Python code that does this, but not very well.

I guess this code could be enhanced with a "merge images into new PDF" checkbox, maybe with options for how many per page, etc.

sorydi3 commented 3 months ago

Hello! I've been working on this issue and came up with a simpler solution that I believe will work well. My approach involves creating a predefined template for the layout, which determines how the images are placed within the PDF file. The images are then arranged according to the user-selected template. I have a pull request ready with more details.
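As an illustration of the template idea described above, here is a minimal sketch of how a predefined grid layout could map images to positions on a page. It is not the code from the pull request; the names `Box`, `grid_template`, and `fit_in_cell` and the 36-point default margin are assumptions made for this example. Coordinates are in PDF points (1/72 inch) with the origin at the bottom-left of the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A rectangle in PDF points; (x, y) is the bottom-left corner."""
    x: float
    y: float
    width: float
    height: float


def grid_template(page_w, page_h, rows, cols, margin=36.0):
    """Split the page into rows x cols equal cells inside a margin.

    Cells are returned left to right, top row first.
    """
    cell_w = (page_w - 2 * margin) / cols
    cell_h = (page_h - 2 * margin) / rows
    boxes = []
    for r in range(rows):
        for c in range(cols):
            x = margin + c * cell_w
            # The origin is bottom-left, so the top row has the largest y.
            y = page_h - margin - (r + 1) * cell_h
            boxes.append(Box(x, y, cell_w, cell_h))
    return boxes


def fit_in_cell(img_w, img_h, cell):
    """Scale an image to fit the cell, keeping its aspect ratio, centred."""
    scale = min(cell.width / img_w, cell.height / img_h)
    w, h = img_w * scale, img_h * scale
    return Box(cell.x + (cell.width - w) / 2,
               cell.y + (cell.height - h) / 2, w, h)


if __name__ == "__main__":
    # Two images stacked on an A4 page (595 x 842 pt), e.g. front and back.
    cells = grid_template(595, 842, rows=2, cols=1)
    for cell in cells:
        print(fit_in_cell(1011, 638, cell))
```

A template here is just a (rows, cols) pair, so offering "2 per page" or "4 per page" would only change the arguments passed to `grid_template`.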

The scanning part, however, seems a bit tricky to address at the moment.

Frooodle commented 3 months ago

Scanning as mentioned already exists here https://stirlingpdf.io/extract-image-scans

Frooodle commented 3 months ago

But I think what you have done, as an enhancement to image-to-PDF, is a better place for it. Users should extract images separately, then use image-to-PDF.

github-cli commented 3 months ago

Thanks a lot guys :) This fulfills the topic of the feature request but not the use case I described.

If that use case is negligible for most, that's fine, but I will elaborate in case it's something others might be interested in. Simply put, I scan (mostly credit-card-sized) documents that have two sides that need scanning.

Currently I scan them separately onto two PDF pages (I could scan to two separate images as well) and then use the crop function to keep only the area of the document. This ends up creating two pages but keeps the size information, so if I print it, it prints at the real size of the document (and I can choose to print both pages on a single piece of paper if I ever need to). Having one document on a single PDF page would be even better, but cropping down to only the image(s) gives me a great advantage: I can save those documents (I keep them in an app called paperless-ngx as well as on Nextcloud) and show them on my smartphone when needed (which is often) without zooming in and accidentally moving into an area that doesn't show the full document (meaning no empty/blank areas).

I think a choice of how many images are distributed on a page, combined with automatic cropping, would end up being even better than what I do right now, but that would need some kind of auto-detection of the location and size of each image. This way I could select just two images for one document, or select a whole bunch, choose two per page, and have one document per page (e.g. the health insurance cards of each family member, one per page, or all identification cards of one person, such as driver's license, government ID, company ID, etc., but one card per page).
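The "N images per page, cropped to the images, printed at real size" idea can be sketched with plain arithmetic. This is only an illustration under stated assumptions: the scan resolution (DPI) is known, the images are already cropped to the card, and the helper names `px_to_pt`, `chunk`, and `tight_page` are hypothetical, not part of Stirling-PDF. The page is sized to exactly wrap the images, so there are no blank areas and the physical size is preserved.

```python
def px_to_pt(pixels, dpi):
    """Convert a pixel length to PDF points (72 points per inch)."""
    return pixels / dpi * 72.0


def chunk(items, per_page):
    """Group items into consecutive lists of at most per_page entries."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


def tight_page(images, dpi, gap_pt=0.0):
    """Stack images top to bottom on a page that is cropped to them.

    images: list of (width_px, height_px) tuples.
    Returns (page_w, page_h, placements); each placement is
    (x, y, w, h) in points with the origin at the bottom-left.
    """
    sizes = [(px_to_pt(w, dpi), px_to_pt(h, dpi)) for w, h in images]
    page_w = max(w for w, _ in sizes)
    page_h = sum(h for _, h in sizes) + gap_pt * (len(sizes) - 1)
    placements = []
    top = page_h
    for w, h in sizes:
        placements.append((0.0, top - h, w, h))
        top -= h + gap_pt
    return page_w, page_h, placements


if __name__ == "__main__":
    # Front and back of three cards scanned at 300 DPI, two sides per page.
    scans = [(1011, 638)] * 6
    for page in chunk(scans, per_page=2):
        print(tight_page(page, dpi=300, gap_pt=10))
```

Because the page dimensions come from pixels divided by DPI, printing the result at 100% scale would reproduce the original card size, provided the DPI value matches the scanner setting.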

Frooodle commented 3 months ago

Scanning as mentioned already exists here https://stirlingpdf.io/extract-image-scans

This ^ is used to try to automatically grab scanned images, be they cards, docs, etc.

Right now it expects a white background and tries to grab the images that are there. I use it to scan two 6x4 photos at once and then extract them into 2 separate images to save on scan time.
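To make the white-background assumption concrete, here is a toy sketch of the general technique: treat every pixel darker than a threshold as content, group touching content pixels, and report one bounding box per group. This is not the Stirling-PDF implementation, which is not shown in this thread; the function name `find_regions`, the threshold of 240, and the `min_pixels` speck filter are all assumptions for illustration, and the input is a plain list of grayscale rows rather than a real image.

```python
from collections import deque


def find_regions(gray, white_threshold=240, min_pixels=4):
    """Find bounding boxes of non-white areas on a white background.

    gray: list of rows, each a list of 0-255 grayscale values.
    Returns (left, top, right, bottom) boxes, inclusive, one per
    4-connected group of pixels darker than white_threshold.
    Groups smaller than min_pixels are ignored as dust.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or gray[y0][x0] >= white_threshold:
                continue
            # Breadth-first flood fill from this unvisited dark pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            left = right = x0
            top = bottom = y0
            count = 0
            while queue:
                x, y = queue.popleft()
                count += 1
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and gray[ny][nx] < white_threshold):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if count >= min_pixels:
                boxes.append((left, top, right, bottom))
    return boxes
```

A sketch like this also shows why a non-white or patterned scanner background is a problem: the whole page would then form one large region instead of one region per card or photo.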