Closed ebotiab closed 5 months ago
It could be useful to remove duplicate or nearly duplicate pages inside one or more PDFs. For example:
pdfly rm-dupl with_dupl_rm.pdf with_dupl_pages
One possible approach would be to convert the PDF pages to images and then remove the pages whose image hashes are similar.
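To make the idea concrete, here is a minimal sketch of the hashing and comparison step only. It assumes each page has already been rendered to a grayscale pixel grid by some external tool; rendering itself is not covered. The names `average_hash`, `hamming`, `unique_page_indices` and the `max_distance` threshold are hypothetical and not part of pdfly or pypdf. The hash used is a simple average hash, picked for illustration; other perceptual hashes would fit the same structure.

```python
from typing import List, Sequence

# A page image as rows of 0-255 grayscale values (assumed to be rendered elsewhere).
Gray = Sequence[Sequence[int]]


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Shrink the image to size x size by block averaging, then set one bit
    per cell that is brighter than the mean of all cells."""
    height, width = len(pixels), len(pixels[0])
    cells: List[float] = []
    for r in range(size):
        r0 = r * height // size
        r1 = max((r + 1) * height // size, r0 + 1)  # at least one row per cell
        for c in range(size):
            c0 = c * width // size
            c1 = max((c + 1) * width // size, c0 + 1)  # at least one column per cell
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    value = 0
    for cell in cells:
        value = (value << 1) | (1 if cell > mean else 0)
    return value


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def unique_page_indices(pages: Sequence[Gray], max_distance: int = 5) -> List[int]:
    """Return indices of pages to keep: a page is dropped when its hash is
    within max_distance bits of a page that was already kept."""
    kept: List[int] = []
    kept_hashes: List[int] = []
    for index, page in enumerate(pages):
        page_hash = average_hash(page)
        if all(hamming(page_hash, h) > max_distance for h in kept_hashes):
            kept.append(index)
            kept_hashes.append(page_hash)
    return kept
```

The first occurrence of each group of similar pages is kept, so page order is preserved. The threshold of 5 bits is an arbitrary starting point and would need tuning against real documents.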
pypdf is not a "viewer" and cannot generate images from pages. This feature cannot be achieved.
Makes sense