red6 / pdfcompare

A simple Java library to compare two PDF files
Apache License 2.0
220 stars, 66 forks

How to measure exclusion areas? #81

Closed: SoltauFintel closed this issue 4 years ago

SoltauFintel commented 4 years ago

I'm evaluating this lib. I use DPI 150 for faster processing. Questions:

a) I want to exclude one value from one page. How can I measure the area of my value in my PDF in a real world scenario?

Yes, I can print the output of getDifferencesJson: {"page":134,"x1":394,"y1":258,"x2":499,"y2":645}. But two values (= 2 rectangles) are combined into one area. And even if there were 2 areas, they would not be big enough if a value gets wider.

b) Is there any text extraction or comparison? e.g. does my actual PDF contain "4711.0815"?

finsterwalder commented 4 years ago

To a) You are right, the "difference area" that the library prints out is always the combined area that includes all differences. Getting the right exclusion areas takes a bit of trial and error. You can specify exclusions in pixels, but also in mm or cm. That might be easier. You can then see the excluded area highlighted in yellow in the output PDF, to verify your exclusions.
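
If you measure the value's position with a ruler or a PDF viewer and still want pixel coordinates, the conversion is plain arithmetic: one inch is 25.4 mm, so pixels = mm / 25.4 * DPI. The following is only a minimal sketch, not part of pdfcompare. The class name `ExclusionUnits`, the method `mmToPixels`, and the measurements in `main` are made up for illustration. It also assumes that pixel coordinates refer to the page rendered at the configured DPI (150 in the question) with the origin at the top left.

```java
final class ExclusionUnits {
    private static final double MM_PER_INCH = 25.4;

    /** Converts a length in millimetres to whole pixels at the given render DPI. */
    static int mmToPixels(double mm, int dpi) {
        return (int) Math.round(mm / MM_PER_INCH * dpi);
    }

    public static void main(String[] args) {
        int dpi = 150; // the DPI mentioned in the question

        // Hypothetical measurement: the value starts 40 mm from the left edge
        // and 25 mm from the top edge, and is 30 mm wide and 6 mm high.
        int x1 = mmToPixels(40, dpi);
        int y1 = mmToPixels(25, dpi);
        int x2 = mmToPixels(40 + 30, dpi);
        int y2 = mmToPixels(25 + 6, dpi);

        System.out.printf("x1=%d, y1=%d, x2=%d, y2=%d%n", x1, y1, x2, y2);
    }
}
```

Measuring a little generously (a few mm extra on the right) leaves room for values that get wider.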

In my company I created a web application that showed a PDF and let users draw rectangles for exclusion areas visually. But that is embedded in an app that I can't open source. :-/

To b) No, PdfCompare only works graphically and does not extract any text. To do that, take a look at the underlying PDFBox library.

SoltauFintel commented 4 years ago

Thank you for your prompt reply. Let me think about it. Maybe I can contribute such a tool as PR. Please leave the ticket open for another week.

SoltauFintel commented 4 years ago

A simple solution could look like this: The user uses the Display class and activates an xy mode there. The mouse pointer changes to a cross. With every click, the x, y position is copied to the clipboard. The user can then paste the coordinates into their Java code (withIgnore).

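        // Needs java.awt.Toolkit, java.awt.datatransfer.{Clipboard, StringSelection}
        // and java.awt.event.{MouseAdapter, MouseEvent}.
        // MouseAdapter avoids implementing all five MouseListener methods.
        leftPanel.addMouseListener(new MouseAdapter() {
            @Override
            public void mouseClicked(MouseEvent e) {
                // TODO if xy mode is active ... (with cross cursor)
                // Ratio between 300 DPI and the comparison DPI;
                // 300.0 avoids integer division if dpi is an int.
                double f = 300.0 / dpi;
                String xy = (int) (e.getX() / leftPanel.zoom / f) + ", " + (int) (e.getY() / leftPanel.zoom / f);

                // Copy "x, y" to the system clipboard.
                Clipboard c = Toolkit.getDefaultToolkit().getSystemClipboard();
                c.setContents(new StringSelection(xy), null);
            }
        });

finsterwalder commented 4 years ago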

Sounds like a simple idea. Could you create a merge request for that? Some improvements I could think of: 1) Open a dialog that shows the coordinates, which includes a copy button. 2) It would be even cooler to allow users to drag a rectangle by holding the mouse button and then get four coordinates. ;-)
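
For idea 2, the core step is turning the press and release points of a drag into four ordered coordinates, since the user may drag in any direction. Below is a rough sketch of that step only, not the code from any pull request. `DragSelection` and `toPageRect` are hypothetical names, and the `zoom` and `scale` parameters are assumed to play the same role as `leftPanel.zoom` and `f` in the snippet above.

```java
final class DragSelection {
    /**
     * Converts the press and release points of a mouse drag into
     * {x1, y1, x2, y2} with x1 <= x2 and y1 <= y2, dividing the screen
     * coordinates by the zoom factor and by the DPI scale (300.0 / dpi).
     */
    static int[] toPageRect(int pressX, int pressY, int releaseX, int releaseY,
                            double zoom, double scale) {
        int x1 = (int) (Math.min(pressX, releaseX) / zoom / scale);
        int y1 = (int) (Math.min(pressY, releaseY) / zoom / scale);
        int x2 = (int) (Math.max(pressX, releaseX) / zoom / scale);
        int y2 = (int) (Math.max(pressY, releaseY) / zoom / scale);
        return new int[] {x1, y1, x2, y2};
    }

    public static void main(String[] args) {
        // Hypothetical drag from bottom right to top left at zoom 1.0 and 150 DPI.
        int[] r = toPageRect(400, 300, 200, 100, 1.0, 300.0 / 150);
        System.out.println(r[0] + ", " + r[1] + ", " + r[2] + ", " + r[3]);
    }
}
```

In a listener, mousePressed would store the start point and mouseReleased would call the helper and show or copy the result.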

gastendonk commented 4 years ago

This issue can be closed once PR #83 is merged.