Open yashlamba opened 2 years ago
This could be a simple weird cropping issue @sakshamarora1.
Hey @yashlamba, I encountered the same thing, but I found that converting my scanned image to grayscale and keeping only the pixels darker than a certain threshold improved the quality.
Original on the left and cleaned version on the right:
Font improvement:
The code I wrote is here; happy to help make a PR if you like.
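import os

import cv2
import numpy as np

# self.sheets_path comes from the enclosing class; adjust the path for standalone use
scanned_jpg = os.path.join(self.sheets_path, "with_lines.jpg")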
cleaned_image = cv2.imread(scanned_jpg)
original_image_in_gray = cv2.cvtColor(cleaned_image, cv2.COLOR_BGR2GRAY)
threshold_level = 150
# To get specific coordinates of the pixels, use the following line:
# coords = np.column_stack(np.where(original_image_in_gray < threshold_level))
# Create a mask of all pixels darker than the threshold level
mask = original_image_in_gray < threshold_level
# Paint the masked (dark) pixels black
cleaned_image[mask] = (0, 0, 0)
# Paint every pixel outside the mask white
cleaned_image[~mask] = (255, 255, 255)
# Convert the cleaned image to grayscale so its shape matches the original for concatenation
updated_image = cv2.cvtColor(cleaned_image, cv2.COLOR_BGR2GRAY)
both_images = np.concatenate((original_image_in_gray, updated_image), axis=1)
cv2.imshow('Side by Side', both_images)
cv2.waitKey()
Hi @chriscohoat, the new output looks much better (we do something similar, with a slightly different approach), but the threshold value you used is exactly what we are trying to figure out.
If you look here: https://github.com/cod-ed/handwrite/blob/440f59153fe02fc96503fd87f9c64b105c8ceb71/handwrite/sheettopng.py#L79-L92
If I am not wrong, we default to a threshold of 200, and it is configurable; @sakshamarora1 will have more context on this. We have also had #3 open for a long time to fix this. In short, the problem is that the threshold that worked for your input might not work for others, so figuring it out dynamically based on the input would be interesting.
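One possible way to derive the threshold from the input itself is Otsu's method, which picks the gray level that best separates dark and light pixels in the image histogram. This is only a sketch of the idea, not what handwrite currently does: `otsu_threshold` and `binarize` are hypothetical helper names, and the input is assumed to be a `uint8` grayscale array.

```python
import numpy as np


def otsu_threshold(gray):
    """Hypothetical helper: Otsu threshold for a uint8 grayscale array."""
    # Histogram of the 256 possible gray levels
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)

    # For each candidate t: pixels <= t are "dark", pixels > t are "light"
    weight_dark = np.cumsum(hist)
    weight_light = hist.sum() - weight_dark
    sum_dark = np.cumsum(hist * levels)
    sum_light = sum_dark[-1] - sum_dark

    # Between-class variance; empty classes produce NaN, which we treat as 0
    with np.errstate(divide="ignore", invalid="ignore"):
        mean_dark = sum_dark / weight_dark
        mean_light = sum_light / weight_light
        between = weight_dark * weight_light * (mean_dark - mean_light) ** 2
    between = np.nan_to_num(between)

    # The threshold is the level that maximizes the between-class variance
    return int(np.argmax(between))


def binarize(gray):
    """Hypothetical helper: black for pixels <= threshold, white otherwise."""
    t = otsu_threshold(gray)
    cleaned = np.where(gray <= t, 0, 255).astype(np.uint8)
    return cleaned, t
```

OpenCV also ships this: `cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)` returns the chosen threshold along with the binarized image. Whether Otsu copes well with unevenly lit scans would need testing on real inputs.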
More related to #68.
Hi @chriscohoat. The new output looks a lot better! As @yashlamba said, we need to figure out how to set the best threshold value dynamically to get results like yours.
Also, the default threshold_value is configurable here in the config JSON file: https://github.com/cod-ed/handwrite/blob/440f59153fe02fc96503fd87f9c64b105c8ceb71/handwrite/default.json#L1-L2
This was initially done so that everyone can set it accordingly as each input is different. I've opened an issue to track this since I missed it before - #70
Initial input:
The above input didn't work, so I tried again with increased exposure:
The font was generated fine using handwrite-web (handwrite 0.2.1), but it had 2 issues:
#68
cc @programmer290399 Found while testing: #67