benwbrum commented 6 years ago

Part of the challenge of presenting images for users to transcribe is knowing which images contain meaningful text (census entries) and which are irrelevant to our purposes (microfilm artifacts, district descriptions, total sheets, signatures). We would like software that can distinguish images containing entries from all other images, whether through computer vision, fuzzy matching on OCR output, or other methods.

The sample data presently includes:

- A classification images directory containing sample images which have already been manually classified.
- A gold classification file listing each image and its classification as entries or other.
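
As a rough sketch of how a classifier's output could be checked against the gold classification file: the thread does not state the file's format, so the code below assumes a hypothetical two-column CSV of `filename,label`, where the label is `entries` or `other`. The function names `load_gold` and `score` are illustrative only.

```python
import csv
import io
from collections import Counter


def load_gold(fileobj):
    """Read rows of `filename,label` (an assumed format) into a dict."""
    return {
        row[0].strip(): row[1].strip()
        for row in csv.reader(fileobj)
        if len(row) >= 2
    }


def score(gold, predicted):
    """Return accuracy and a Counter keyed by (gold label, predicted label)."""
    counts = Counter()
    for name, truth in gold.items():
        # Images the classifier skipped are counted under "missing".
        guess = predicted.get(name, "missing")
        counts[(truth, guess)] += 1
    correct = sum(n for (truth, guess), n in counts.items() if truth == guess)
    accuracy = correct / len(gold) if gold else 0.0
    return accuracy, counts


# Made-up sample rows standing in for the real gold file.
sample = io.StringIO("a.jpg,entries\nb.jpg,other\nc.jpg,entries\n")
gold = load_gold(sample)
predicted = {"a.jpg": "entries", "b.jpg": "entries", "c.jpg": "entries"}
accuracy, counts = score(gold, predicted)
```

Keeping the per-pair counts rather than only the accuracy makes it visible whether errors are mostly `other` images mistaken for `entries` or the reverse, which matters if one kind of mistake is more costly for transcribers.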

cramraj8 commented 6 years ago


When we resize the original images, which are (3000+, 2000+), down to (430, 250), we distort them. For a CNN to classify the images, there must be distinct features that differentiate the classes, but in our problem the images are full of text and artifacts. So without correctly interpreting our trained model, how can we be sure that it is extracting the right distinguishing features? For instance, we can use Grad-CAM to localize the distinguishing features.
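
To make the distortion concrete, here is a minimal sketch of the arithmetic, assuming both sizes are given as (width, height) and taking 3000 x 2000 as a representative original size (the thread only says "3000+, 2000+"). It also shows an aspect-preserving alternative, scaling to fit and padding the remainder, which is not something proposed in the thread. The helper names are hypothetical.

```python
def resize_scales(src, dst):
    """Per-axis scale factors for a direct resize from src to dst."""
    (src_w, src_h), (dst_w, dst_h) = src, dst
    return dst_w / src_w, dst_h / src_h


def letterbox_size(src, dst):
    """Largest size that fits inside dst while keeping src's aspect ratio.

    Returns the scaled size and the total padding needed on each axis.
    """
    (src_w, src_h), (dst_w, dst_h) = src, dst
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    return (new_w, new_h), (dst_w - new_w, dst_h - new_h)


original = (3000, 2000)  # assumed representative size
target = (430, 250)

scale_x, scale_y = resize_scales(original, target)
# A ratio other than 1.0 means the two axes are scaled differently.
stretch = scale_x / scale_y
fitted, padding = letterbox_size(original, target)
```

Under these assumptions the horizontal axis is scaled about 15% more than the vertical one, and each output pixel covers roughly a 7 x 8 block of the original, so fine handwriting is unlikely to survive as legible strokes either way.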

cramraj8 commented 6 years ago

We have two options:

1. Switch the order of classification and detection in the pipeline.
2. Upsample the images.

Option (1) would not serve our classification purpose in terms of cost, so upsampling might help us.

FreeUKGen / SummerOfCodeImages

Census Image Classification #2
