sparkfish / shabby-pages

ShabbyPages is a state-of-the-art corpus of born-digital document images, with both ground-truth and distorted versions, suitable for training models to reverse distortions and recover the original, denoised documents.
MIT License
50 stars, 6 forks

Cluster images #3

Closed by proofconstruction 2 years ago

proofconstruction commented 2 years ago

We want to group images into clusters based on their similarity, using perceptual hashing or another method.

Apricot is one option; this method is another.
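For reference, here is a minimal sketch of what clustering by perceptual hash could look like. It is not the project's method and does not use Apricot. It assumes the simplest perceptual hash (an average hash), images already loaded as 2D grids of grayscale values, and single-linkage grouping under a Hamming-distance threshold. The names `average_hash`, `hamming`, `cluster_by_hash` and the default `max_distance=5` are all hypothetical.

```python
from typing import Dict, List, Sequence

Grid = Sequence[Sequence[float]]  # assumed input: rows of grayscale values


def average_hash(pixels: Grid, hash_size: int = 8) -> int:
    """Average hash: block-average down to hash_size x hash_size cells,
    then set one bit per cell that is brighter than the mean cell value."""
    height, width = len(pixels), len(pixels[0])
    cells: List[float] = []
    for r in range(hash_size):
        r0 = r * height // hash_size
        r1 = max((r + 1) * height // hash_size, r0 + 1)
        for c in range(hash_size):
            c0 = c * width // hash_size
            c1 = max((c + 1) * width // hash_size, c0 + 1)
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def cluster_by_hash(hashes: List[int], max_distance: int = 5) -> List[List[int]]:
    """Single-linkage clustering: images whose hashes differ by at most
    max_distance bits end up in the same cluster (union-find over pairs)."""
    parent = list(range(len(hashes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(hashes)):
        for j in range(i + 1, len(hashes)):
            if hamming(hashes[i], hashes[j]) <= max_distance:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(hashes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Calling `cluster_by_hash([average_hash(img) for img in images])` returns lists of image indices, one list per cluster. The pairwise comparison is quadratic in the number of images, so a corpus of this size would likely need an index or a submodular selection approach instead.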

proofconstruction commented 2 years ago

I think we abandoned this for now.