GCC15 / bilibili-captcha

Recognize CAPTCHA generated by bilibili.com
MIT License

A problem of remove_noise_with_hsv #4

Closed zlhaa23 closed 9 years ago

zlhaa23 commented 9 years ago

The current procedure of remove_noise_with_hsv is as follows.

  1. Find the second most frequent color in the image and take it as the standard color, (std_h, std_s, std_v).
  2. For each pixel in the original CAPTCHA image, calculate its deviation from the standard color, (delta_h, delta_s, delta_v).
  3. If delta_h <= h_tol && delta_s <= s_tol && delta_v <= v_tol, set the pixel's new grayscale value to 1 - delta_v; otherwise set it to 0.
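The three steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `remove_noise_with_hsv_sketch`, the nested-list image representation, and the use of a plain absolute difference per channel (ignoring hue wrap-around) are all assumptions made here for the sake of a self-contained example.

```python
from collections import Counter


def remove_noise_with_hsv_sketch(img_hsv, h_tol, s_tol, v_tol):
    """Hypothetical sketch of the procedure described in this issue.

    img_hsv: list of rows, each row a list of (h, s, v) tuples in [0, 1].
    Returns a grid of grayscale values with the same shape.
    """
    # Step 1: take the 2nd most frequent color as the standard color.
    # Assumes the image contains at least two distinct colors.
    counts = Counter(px for row in img_hsv for px in row)
    std_h, std_s, std_v = counts.most_common(2)[1][0]

    out = []
    for row in img_hsv:
        new_row = []
        for h, s, v in row:
            # Step 2: per-channel deviation from the standard color.
            # Assumption: plain absolute difference, no hue wrap-around.
            delta_h = abs(h - std_h)
            delta_s = abs(s - std_s)
            delta_v = abs(v - std_v)
            # Step 3: keep the pixel only if all deviations are in tolerance.
            if delta_h <= h_tol and delta_s <= s_tol and delta_v <= v_tol:
                new_row.append(1 - delta_v)
            else:
                new_row.append(0.0)
        out.append(new_row)
    return out
```

With this formulation, a pixel that is close to the standard color but not identical (for example an anti-aliased edge pixel) is either kept with a slightly reduced value or dropped to 0, depending on how the tolerances are chosen.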

Problem

For some CAPTCHA images, this method produces characters with very different stroke thicknesses.

Case A: [images: 00 origin, 01 hsv]

Case B: [images: 00 origin, 01 hsv]

The original characters appear to have almost identical thicknesses, but after this procedure the character in Case A is about twice as thick as the one in Case B.
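One rough way to put a number on this difference is to compare the mean horizontal run length of foreground pixels in the two filtered images. The helper below is only a sketch: `mean_stroke_width` is a hypothetical name, and it assumes the filtered image is available as a grid (list of rows) of grayscale values where 0 means background.

```python
def mean_stroke_width(gray, threshold=0.0):
    """Mean length of horizontal runs of pixels with value > threshold.

    gray: list of rows of grayscale values (0 = background).
    Returns 0.0 if the image has no foreground pixels.
    """
    runs = []
    for row in gray:
        run = 0
        for value in row:
            if value > threshold:
                run += 1          # still inside a stroke
            elif run:
                runs.append(run)  # stroke just ended
                run = 0
        if run:
            runs.append(run)      # stroke reaches the right border
    return sum(runs) / len(runs) if runs else 0.0
```

Horizontal run length is a crude proxy for stroke thickness (it overestimates on horizontal strokes), but it should be enough to show a 2x gap like the one between Case A and Case B.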

[images: origin, hsv]

If this is fixed...