BoiseState-AdaptLab / OCR_4_Forest_Service


Try new image enhancement techniques #28

Closed FloCiaglia closed 3 years ago

FloCiaglia commented 3 years ago

The paper goes over new preprocessing techniques that we have not tried.

[x] Create new tests in the test_char_detection folder that implement the new techniques.

Implements Story #27

FloCiaglia commented 3 years ago

Image Enhancement

Step 1 - Binarization (Thresholding)

We have always used Otsu's algorithm for binarization. The paper explains that global thresholding techniques (like Otsu) perform better when the background is uniform, whereas local thresholding techniques (like the threshold_adaptive function) tend to perform better when the background color varies throughout the form.

My first step is to try this new thresholding technique.

UPDATE: The local thresholding techniques (MEAN and GAUSSIAN) did not perform well on our data. We'll stick with global thresholding using Otsu.
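
For anyone who wants to reproduce the comparison, here is a minimal, dependency-free sketch of the two approaches. It is not the code from our pipeline: the function names, the list-of-lists image format, and the `block` and `c` parameters are assumptions made for illustration. In practice a library routine (for example OpenCV's `cv2.threshold` with `cv2.THRESH_OTSU`, or `cv2.adaptiveThreshold`) would do the same job faster.

```python
def otsu_threshold(image):
    """Return the Otsu threshold for a 2D list of 0-255 grayscale values."""
    hist = [0] * 256
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize_global(image):
    """Global thresholding: one Otsu threshold for the whole image."""
    t = otsu_threshold(image)
    return [[255 if v > t else 0 for v in row] for row in image]


def binarize_local_mean(image, block=3, c=5):
    """Local mean thresholding: each pixel is compared with the mean of its
    block x block neighborhood minus c. Simplification (assumption): windows
    are clipped at the image border instead of padded."""
    h, w = len(image), len(image[0])
    r = block // 2
    out = []
    for y in range(h):
        row_out = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            local_t = sum(window) / len(window) - c
            row_out.append(255 if image[y][x] > local_t else 0)
        out.append(row_out)
    return out
```

The global version picks a single cutoff from the histogram, which is why it works well when the background is uniform. The local version recomputes the cutoff per pixel, so it can follow a varying background but is also more sensitive to the choice of `block` and `c`.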

FloCiaglia commented 3 years ago

Step 2 - Noise Reduction

To reduce noise in the images, I have tried two different techniques:

Gaussian Blurring: We had tried this technique before and decided that we didn't need it in our pipeline. I will explore this option further.
Median filtering: This technique didn't yield good results on our dataset.
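
As a rough illustration of what the two filters do, here is a small pure-Python sketch with a fixed 3x3 window. The helper names, the list-of-lists image format, and the replicate-the-edge border handling are assumptions for this example, not our pipeline code; a real run would more likely use library calls such as OpenCV's `cv2.GaussianBlur` and `cv2.medianBlur`.

```python
from statistics import median

# 3x3 Gaussian kernel approximation; the weights sum to 16.
GAUSS_3X3 = [1, 2, 1,
             2, 4, 2,
             1, 2, 1]


def _clamp(i, n):
    """Clamp an index into [0, n - 1] so border pixels are replicated."""
    return max(0, min(i, n - 1))


def neighborhood(image, y, x):
    """Return the 9 values of the 3x3 window centered on (y, x)."""
    h, w = len(image), len(image[0])
    return [image[_clamp(y + dy, h)][_clamp(x + dx, w)]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def gaussian_blur_3x3(image):
    """Replace each pixel with the weighted average of its 3x3 window."""
    h, w = len(image), len(image[0])
    return [[round(sum(k * v for k, v in
                       zip(GAUSS_3X3, neighborhood(image, y, x))) / 16)
             for x in range(w)] for y in range(h)]


def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 window."""
    h, w = len(image), len(image[0])
    return [[int(median(neighborhood(image, y, x)))
             for x in range(w)] for y in range(h)]
```

On an isolated dark speck, the median filter removes the speck entirely, while the Gaussian blur only spreads it into its neighbors. Both can also erase or soften thin pen strokes, which is one possible reason a filter might not help on handwritten fields.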

FloCiaglia commented 3 years ago

Step 3 - Skew Detection and Correction

I don't think we need to add this step to our pipeline because our text is already horizontal in the picture. I believe this step would make our code more complex and, at the same time, not provide a significant improvement.

FloCiaglia commented 3 years ago

Step 4 - Character Segmentation

This step is where everything happens. The performance of the pipeline and of the model's classification depends on how well our character segmentation step performs.

NOTE: One thing I noticed is that when we preprocess the fields, the background becomes white and the foreground becomes black. The black letters often have 1-2 pixel gaps in them, which fragment the letter, so the next steps don't recognize the pieces as a single letter. I wonder if a simple script that connects pixels that are 1 pixel apart would improve the performance of this step. I will implement this in issue #29.
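
One possible way to connect the fragments is a morphological closing (dilation followed by erosion) of the ink pixels. The sketch below is only an illustration of that idea, not what issue #29 implements: the function names and the list-of-lists image format are assumptions, and it expects black ink (0) on a white background (255) as described above. With a 3x3 neighborhood it bridges gaps of up to 2 pixels.

```python
# All 9 offsets of a 3x3 neighborhood, including the center.
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _get(mask, y, x):
    """Read a mask cell, treating anything outside the grid as background."""
    if 0 <= y < len(mask) and 0 <= x < len(mask[0]):
        return mask[y][x]
    return False


def dilate(mask):
    """A cell becomes ink if any cell in its 3x3 neighborhood is ink."""
    return [[any(_get(mask, y + dy, x + dx) for dy, dx in OFFSETS)
             for x in range(len(mask[0]))] for y in range(len(mask))]


def erode(mask):
    """A cell stays ink only if every cell in its 3x3 neighborhood is ink."""
    return [[all(_get(mask, y + dy, x + dx) for dy, dx in OFFSETS)
             for x in range(len(mask[0]))] for y in range(len(mask))]


def close_gaps(image, ink=0, paper=255):
    """Bridge small gaps in black-on-white strokes with a 3x3 closing."""
    h, w = len(image), len(image[0])
    # Pad with one background pixel per side so strokes touching the
    # border are not eroded away.
    padded = ([[False] * (w + 2)]
              + [[False] + [v == ink for v in row] + [False] for row in image]
              + [[False] * (w + 2)])
    closed = erode(dilate(padded))
    # Crop the padding and convert the mask back to pixel values.
    return [[ink if closed[y + 1][x + 1] else paper for x in range(w)]
            for y in range(h)]
```

A stroke such as `ink, ink, gap, ink, ink` comes back as one solid run. The trade-off is that the same operation would also join two different letters that sit 1-2 pixels apart, so it would need to be checked against fields with tightly spaced characters.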