skotz / cbl-js

JavaScript CAPTCHA solving library
MIT License
155 stars, 47 forks

Solving when two characters touch #32

Open openseauser opened 5 years ago

openseauser commented 5 years ago

I'm trying to train this thing to solve an easy captcha, but I'm getting a little stuck. When two characters in the captcha are touching, no combination of settings I've tried gets it to pick up both characters.

Do you know how this could be done? Maybe I'm missing a setting that should be changed? Some examples are attached.

In the first one it won't recognize the 5 and the a, in the second one the m and the d, and in the third one the p or the 4.

[Attached images: 8, 3, 4]

skotz commented 5 years ago

Since this CAPTCHA always has the same number of characters, you could play with the exact_characters setting. That'll repeatedly split the largest extracted character in half until you have exactly that many characters.

var cbl = new CBL({
    preprocess: function(img) {
        img.debugImage("debugPreprocessed");
        img.binarize(32);
        img.debugImage("debugPreprocessed");
        img.colorRegions(50, true, 1);
        img.debugImage("debugPreprocessed");
    },
    character_set: "0123456789abcdefghijklmnopqrstuvwxyz",
    exact_characters: 6,
    pattern_width: 24,
    pattern_height: 24,
    blob_min_pixels: 1,
    blob_max_pixels: 10000,
    allow_console_log: true,
    blob_console_debug: true,
    blob_debug: "debugSegmented"
});

This seems to get fairly good results. With a little work on the preprocessing steps you should be able to get near 100% accuracy on this.
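
To make the splitting behavior described above concrete, here is a minimal sketch of the idea. It is not taken from the cbl-js source: the function name splitToExactCount and the simplified blob shape { x, width } are hypothetical, and it assumes "largest" means "widest", whereas the library works on actual pixel data and may measure size differently.

```javascript
// Hypothetical sketch, NOT cbl-js source code.
// Each blob is modeled as a horizontal extent { x, width } (an assumption).
function splitToExactCount(blobs, exactCharacters) {
    // Copy the input so the caller's array is not mutated.
    var result = blobs.map(function (b) {
        return { x: b.x, width: b.width };
    });

    while (result.length > 0 && result.length < exactCharacters) {
        // Find the widest blob; it most likely holds two touching characters.
        var widest = 0;
        for (var i = 1; i < result.length; i++) {
            if (result[i].width > result[widest].width) {
                widest = i;
            }
        }

        var blob = result[widest];
        if (blob.width < 2) {
            break; // Nothing left that can be split in half.
        }

        // Split the widest blob down the middle into a left and right half.
        var leftWidth = Math.floor(blob.width / 2);
        var left = { x: blob.x, width: leftWidth };
        var right = { x: blob.x + leftWidth, width: blob.width - leftWidth };
        result.splice(widest, 1, left, right);
    }

    return result;
}

// Five extracted blobs for a six-character CAPTCHA: the second blob is
// roughly twice as wide as the others, so it gets split in two.
var extracted = [
    { x: 0, width: 20 },
    { x: 25, width: 42 },
    { x: 70, width: 18 },
    { x: 92, width: 20 },
    { x: 115, width: 19 }
];
var segmented = splitToExactCount(extracted, 6);
console.log(segmented.length);
```

A plain midpoint split is only a rough guess at where two touching characters meet, which is why cleaner preprocessing still matters for accuracy.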


openseauser commented 5 years ago

Works for me, thank you!

skotz commented 4 years ago

Just fixed bug #33, which will improve the segmentation for this one.