phulin / rebook

A collection of tools for cleaning up book scans.

a solution for a bug in binarization.py su2013 algo #2

Open ghsama opened 5 years ago

ghsama commented 5 years ago

Hey, in the su2013 implementation we have:

    def su2013(im, gamma=0.25):
        W = 5
        horiz = cv2.getStructuringElement(cv2.MORPH_RECT, (W, 1))
        vert = cv2.getStructuringElement(cv2.MORPH_RECT, (1, W))
        I_min = cv2.erode(cv2.erode(im, horiz), vert)
        I_max = cv2.dilate(cv2.dilate(im, horiz), vert)
        diff = I_max - I_min
        C = diff.astype(np.float32) / (I_max + I_min + 1e-16)
        alpha = (im.std() / 128.0) ** gamma
        C_a = alpha * C + (1 - alpha) * diff
        _, C_a_bw = cv2.threshold(C_a, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
        return C_a_bw

The line `_, C_a_bw = cv2.threshold(C_a, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)` needs `C_a` to be of type `uint8`, which is not the case (it is float64). To resolve it, we just need to cast it: `_, C_a_bw = cv2.threshold(C_a.astype('uint8'), 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)`

The algo becomes:

    def su2013(im, gamma=0.25):
        W = 5
        horiz = cv2.getStructuringElement(cv2.MORPH_RECT, (W, 1))
        vert = cv2.getStructuringElement(cv2.MORPH_RECT, (1, W))
        I_min = cv2.erode(cv2.erode(im, horiz), vert)
        I_max = cv2.dilate(cv2.dilate(im, horiz), vert)
        diff = I_max - I_min
        C = diff.astype(np.float32) / (I_max + I_min + 1e-16)
        alpha = (im.std() / 128.0) ** gamma
        C_a = alpha * C + (1 - alpha) * diff
        _, C_a_bw = cv2.threshold(C_a.astype('uint8'), 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
        return C_a_bw

PS: Thanks a lot for this wonderful work 👍 👍

phulin commented 5 years ago

Thanks for figuring this out. Can you send a PR?

ghsama commented 5 years ago

Sorry for the delay, I will make a PR as soon as I can.