red6 / pdfcompare

A simple Java library to compare two PDF files
Apache License 2.0
220 stars, 66 forks

False diff: formulae embedded as images in a relatively light font are highlighted as differences #123

Open rajeshatuce opened 2 years ago

rajeshatuce commented 2 years ago

Hey,

I am struggling with 2 false diff issues at the moment:

  1. I have 2 PDFs containing many formulae embedded as images, in a relatively light font. One of the formulae is highlighted as a difference even though, on manual verification, I can't find any difference at all. Any idea what can be done so that it is not highlighted as a difference?
  2. Between the 2 PDF files there is possibly a 1-2 pixel difference in spacing. A very light red or green line, like a pipe symbol, is printed in the diff PDF, marking a difference that I can't spot with the human eye. Any idea what can be done to fix this?

Thanks for your reply.

rajeshatuce commented 2 years ago

Update: if I retry a couple of times, the same comparison passes. Any idea what causes this? I am using version 1.1.61.

finsterwalder commented 2 years ago

No idea, really. It all depends on the pixel rendering done by the PDF library, PDFBox. Embedded elements may be tricky for the library to render.
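To illustrate why rendering matters, here is a minimal sketch of a strict pixel-by-pixel comparison of two rendered pages. It is not pdfcompare's actual implementation; the class and method names are hypothetical, and it uses only the standard `BufferedImage` API. Under this kind of check, a single very light pixel that is rendered slightly differently already counts as a difference.

```java
import java.awt.image.BufferedImage;

// Hypothetical sketch: NOT pdfcompare's code, only an illustration of strict pixel comparison.
final class PixelDiffSketch {

    /** Counts the pixels whose colour values differ between two equally sized images. */
    static int countDifferingPixels(BufferedImage expected, BufferedImage actual) {
        if (expected.getWidth() != actual.getWidth()
                || expected.getHeight() != actual.getHeight()) {
            throw new IllegalArgumentException("images must have the same dimensions");
        }
        int differing = 0;
        for (int y = 0; y < expected.getHeight(); y++) {
            for (int x = 0; x < expected.getWidth(); x++) {
                if (expected.getRGB(x, y) != actual.getRGB(x, y)) {
                    differing++;
                }
            }
        }
        return differing;
    }

    public static void main(String[] args) {
        BufferedImage expected = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage actual = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        // Fill both "pages" with white.
        for (int y = 0; y < 4; y++) {
            for (int x = 0; x < 4; x++) {
                expected.setRGB(x, y, 0xFFFFFF);
                actual.setRGB(x, y, 0xFFFFFF);
            }
        }
        // One pixel rendered as a barely visible light grey instead of white.
        actual.setRGB(1, 2, 0xFAFAFA);
        System.out.println("differing pixels: " + countDifferingPixels(expected, actual));
    }
}
```

A difference this faint is practically invisible to the eye but is still reported by an exact comparison, which matches the symptom described in the original report.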

nisckis commented 2 years ago

Good day. We had the same random failures, and they disappeared when we disabled parallel processing by adding parallelProcessing=false to the configuration.

We tested for the random failures by comparing the same files multiple times via the CLI. After disabling parallel processing, none of the executions reported an incorrect difference again.
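The repeat-and-count check described above could be sketched as a small harness like the following. The names are hypothetical, and the comparison itself is left as a placeholder `BooleanSupplier`, since how pdfcompare is invoked (CLI or API) depends on your setup.

```java
import java.util.function.BooleanSupplier;

// Hypothetical harness for spotting non-deterministic comparison results.
final class FlakinessCheck {

    /** Runs the comparison the given number of times and returns how often it reported a difference. */
    static int countFailures(BooleanSupplier comparison, int runs) {
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            if (!comparison.getAsBoolean()) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Placeholder: replace with a call that compares the same two files
        // and returns true when they are reported as equal.
        BooleanSupplier comparison = () -> true;
        int runs = 20;
        int failures = countFailures(comparison, runs);
        System.out.println(failures + " of " + runs + " runs reported a difference");
    }
}
```

If identical inputs produce a failure count other than 0 or `runs`, the comparison is non-deterministic, which points at something like a concurrency issue rather than at the PDFs themselves.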

rkgour7492 commented 1 year ago

@nisckis Is there a way to set this flag via the CLI?

finsterwalder commented 1 year ago

Please check the readme: https://github.com/red6/pdfcompare#configuring-pdfcompare

-DparallelProcessing=false should do it.
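For readers unfamiliar with the `-D` syntax: it sets a JVM system property, which the application can then read. The sketch below shows only that generic JVM mechanism; pdfcompare resolves its settings through its own configuration handling, and both the class name and the default of `true` for a missing property are assumptions made for illustration.

```java
// Hypothetical sketch of how a -D flag becomes visible inside the JVM.
final class ParallelFlagSketch {

    /** Reads the "parallelProcessing" system property; a missing property is treated as true (assumed default). */
    static boolean parallelProcessingEnabled() {
        return Boolean.parseBoolean(System.getProperty("parallelProcessing", "true"));
    }

    public static void main(String[] args) {
        System.out.println("parallelProcessing = " + parallelProcessingEnabled());
    }
}
```

Note that `-D` options are JVM options, so they must appear before the main class or `-jar` argument on the `java` command line; anything placed after it is passed to the program as an ordinary argument instead.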