fsinf / pdf-page-stripper

Strips useless pages from TU Wien PDFs
https://fsinf.github.io/pdf-page-stripper/
The Unlicense

Detect duplicated pages by visually comparing them #3

Open stefnotch opened 1 year ago

stefnotch commented 1 year ago

Someone finally sent me some PDFs that have duplicated pages where the page metadata got lost.

Here, the best way of identifying duplicates would probably be to compare the pages visually.
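A rough sketch of what a visual comparison could look like, using an average hash per page and a Hamming-distance threshold. This is only one possible approach and not what the project implements. It assumes each page has already been rendered to a grayscale pixel grid (rows of 0..255 values) by some PDF renderer. The names `average_hash`, `hamming`, `find_duplicates` and the `max_distance` threshold are made up for illustration.

```python
from typing import List, Sequence

# Assumption: a page is a grayscale raster, given as rows of 0..255 values.
Gray = Sequence[Sequence[int]]


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Shrink the page to size x size cells and set one bit per cell
    that is brighter than the mean of all cells."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("page raster is smaller than the hash grid")

    cells: List[float] = []
    for cy in range(size):
        y0, y1 = cy * height // size, (cy + 1) * height // size
        for cx in range(size):
            x0, x1 = cx * width // size, (cx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))

    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bits in which two hashes differ."""
    return bin(a ^ b).count("1")


def find_duplicates(pages: Sequence[Gray], max_distance: int = 2) -> List[int]:
    """Return the indices of pages that look like an earlier, kept page.
    max_distance is a guessed threshold and would need tuning on real files."""
    kept_hashes: List[int] = []
    duplicates: List[int] = []
    for index, pixels in enumerate(pages):
        page_hash = average_hash(pixels)
        if any(hamming(page_hash, kept) <= max_distance for kept in kept_hashes):
            duplicates.append(index)
        else:
            kept_hashes.append(page_hash)
    return duplicates
```

An 8x8 hash is coarse: it tolerates small rendering noise, but it may also treat two slides that differ by a single line of text as identical. That may or may not be what is wanted here, so a larger `size` or a smaller `max_distance` would be worth trying against the test files.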

stefnotch commented 1 year ago

Test_fur_PDF_Stripper.pdf

stefnotch commented 1 year ago

More test files:

algorithmics_ws22_part1-1.pdf

algorithmics_ws22_part2-1.pdf

algorithmics_ws22_part3-1.pdf