ShabbyPages is a state-of-the-art corpus of born-digital document images, with both ground-truth and distorted versions, appropriate for training models to reverse distortions and recover the original, denoised documents.
MIT License
48 stars, 6 forks
Added code to convert PDFs into images and split clean/dirty images. #73
The added code should work on both Windows and Linux. I tested it on my own machine and on a Linux server here: