miller-center / cpc-issues

Connecting Presidential Collections

Deal with scanning bleed-through #6

Open waldoj opened 10 years ago

waldoj commented 10 years ago

We're going to have a lot of pages in which the text on the other side of the page can be seen, faintly, which is going to be distracting.

We'll need to devise one process to identify bleed-through and a second to eliminate it. I think elimination will come down to taking each of the two adjacent pages in turn and attempting to subtract its contents from the affected page. I worry that this will eliminate everything, since the bleed runs in both directions, but perhaps increasing the contrast on those adjacent pages will address that. Then we'll need to automatically determine which of the two resulting images is actually better, since of course one of them will have been produced from entirely the wrong page.
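A minimal sketch of that idea, using NumPy on grayscale arrays (0 is ink, 255 is paper). Everything here is an assumption for illustration rather than a settled approach: the names `remove_bleed`, `faint_fraction`, and `pick_better`, the `strength` value, and the gray band used for scoring are all hypothetical. It also assumes the two scans are the same size and already aligned, and that the reverse side shows through mirrored left to right.

```python
import numpy as np


def remove_bleed(page, neighbor, strength=0.3):
    """Lighten `page` wherever the mirrored `neighbor` has ink.

    `strength` is a guess at how much of the neighbor's ink shows
    through; a real pipeline would have to estimate it per scan.
    """
    page = page.astype(np.float64)
    # Ink on the reverse side appears mirrored when seen through the paper.
    neighbor_ink = 255.0 - np.fliplr(neighbor).astype(np.float64)
    cleaned = page + strength * neighbor_ink
    return np.clip(cleaned, 0, 255).astype(np.uint8)


def faint_fraction(image, low=100, high=230):
    """Fraction of pixels that are neither solid ink nor clean paper.

    Hypothetical quality score: bleed-through is faint, so fewer
    mid-gray pixels is taken to mean less remaining bleed.
    """
    faint = (image > low) & (image < high)
    return float(faint.mean())


def pick_better(page, neighbor_a, neighbor_b, strength=0.3):
    """Try both adjacent pages and keep the result with less faint ink."""
    candidates = [
        remove_bleed(page, neighbor_a, strength),
        remove_bleed(page, neighbor_b, strength),
    ]
    scores = [faint_fraction(c) for c in candidates]
    best = int(np.argmin(scores))
    return best, candidates[best]
```

Note the trade-off the comment above already worries about: wherever the reverse side's ink overlaps real text on this page, the real text is lightened too, so `strength` has to stay well below 1.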

waldoj commented 10 years ago

Here's an example of that.

Side 1:

[image: page_bleed]

And side 2:

[image: page_bleed_2]

waldoj commented 10 years ago

I've noticed that the tilt of the letters is a giveaway. Cursive letters lean up and to the right (/), but in the bleed-through they lean the opposite way (\).
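One rough way to turn that observation into a number, sketched with NumPy. This is only a heuristic under stated assumptions, not a tested detector: the name `slant_score` is hypothetical, and the input is assumed to be a boolean mask of the strokes of interest (for example, pixels in the faint gray range). It counts ink pixels that continue along the "/" diagonal versus the "\" diagonal.

```python
import numpy as np


def slant_score(ink_mask):
    """Return a value in [-1, 1]: positive for "/" lean, negative for "\\".

    `ink_mask` is a 2-D boolean array, True where there is ink.
    Row 0 is the top of the image.
    """
    ink_mask = np.asarray(ink_mask, dtype=bool)
    # Pairs where a pixel and the one up-and-to-the-right are both ink.
    forward = np.count_nonzero(ink_mask[1:, :-1] & ink_mask[:-1, 1:])
    # Pairs where a pixel and the one down-and-to-the-right are both ink.
    backward = np.count_nonzero(ink_mask[:-1, :-1] & ink_mask[1:, 1:])
    total = forward + backward
    if total == 0:
        return 0.0  # no diagonal strokes at all
    return (forward - backward) / total
```

A clearly negative score on the faint strokes of a page would then be a hint that they are bleed-through rather than lightly written text, though vertical or very thick strokes contribute to both counts and will pull the score toward zero.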