skotz / cbl-js

JavaScript CAPTCHA solving library
MIT License
155 stars, 47 forks

How about this one? #20

Closed by c4shm4st3r 6 years ago

c4shm4st3r commented 6 years ago

Hello, I have a harder challenge: how would I use the library to presegment and actually solve this single-character CAPTCHA?

It is a single digit from 1 to 9.

I have a working method, but I'd like to do it with this library, just out of curiosity about how to get it working in JavaScript.

[Images: verimage-477749927, verimage-905817473, verimage-427168729]

Those images are just a small sample from a larger dataset. How would I get rid of the noisy background?

Thanks in advance.

skotz commented 6 years ago

What you really need here is an erosion filter, but unfortunately I haven't implemented that yet. You can try to get by with repeated blurring and binarization since that'll erode away edges until only the thickest lines (the numbers) are left. It's a hack, but the only other option is to implement more image manipulation methods.

var cbl = new CBL({
    preprocess: function(img) {
        // Repeatedly blur and binarize to erode away thin lines;
        // only the thickest strokes (the digits) should survive.
        for (var i = 0; i < 7; i++) {
            img.blur(5);
            img.debugImage("debugPreprocessed");
            img.binarize(45);
            img.debugImage("debugPreprocessed");
        }
        // Color each remaining connected region so it can be segmented.
        img.colorRegions(40, true);
        img.debugImage("debugPreprocessed");
    },
    character_set: "0123456789",
    blob_min_pixels: 1000,
    blob_max_pixels: 10000,
    pattern_width: 128,
    pattern_height: 128,
    pattern_maintain_ratio: true,
    allow_console_log: true,
    perceptive_colorspace: true,
    blob_debug: "debugSegmented",
    model_loaded: function() {
        document.getElementById("aSolve").style.display = "inline-block";
    }
});

cbl.train("3.png");
cbl.train("2.png");
cbl.train("1.png");

[Attached images: 3, 1, 2, 3, 4, download]
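
For comparison, here is a minimal sketch of what a plain erosion filter does, written in standalone JavaScript. It is not part of cbl-js; the `erode` function, the `radius` parameter, and the convention that 1 means ink and 0 means background are all assumptions made for illustration. A pixel survives only if its whole neighborhood is ink, so thin noise lines disappear while the core of a thick stroke remains.

```javascript
// Hypothetical helper, not a cbl-js method: erode a binary image
// (1 = ink, 0 = background) with a square (2r+1) x (2r+1) window.
function erode(image, radius = 1) {
  const height = image.length;
  const width = image[0].length;
  const out = image.map(row => row.map(() => 0));
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let keep = 1;
      for (let dy = -radius; dy <= radius && keep; dy++) {
        for (let dx = -radius; dx <= radius && keep; dx++) {
          const ny = y + dy;
          const nx = x + dx;
          // Pixels outside the image count as background.
          const inside = ny >= 0 && ny < height && nx >= 0 && nx < width;
          if (!inside || image[ny][nx] !== 1) keep = 0;
        }
      }
      out[y][x] = keep;
    }
  }
  return out;
}

// Toy input: a 1-pixel noise line (row 1) and a thick 3x3 blob (rows 3-5).
const noisy = [
  [0, 0, 0, 0, 0, 0, 0],
  [1, 1, 1, 1, 1, 1, 1],
  [0, 0, 0, 0, 0, 0, 0],
  [0, 0, 1, 1, 1, 0, 0],
  [0, 0, 1, 1, 1, 0, 0],
  [0, 0, 1, 1, 1, 0, 0],
  [0, 0, 0, 0, 0, 0, 0],
];

const eroded = erode(noisy);
console.log(eroded.map(row => row.join("")).join("\n"));
```

In this toy input the thin line is removed entirely and only the center of the thick blob is kept, which is the effect the repeated blur and binarize passes approximate.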

c4shm4st3r commented 6 years ago

Damn, I'm learning quite a lot with this library. Thanks for the reply, I will play around with it.

c4shm4st3r commented 6 years ago

Just one more question: if the CAPTCHA character is not in the dataset, does it still detect similarities?

Imagine this: I have 30 different samples of the number 3, but a new one comes in that differs from all of them. Does it still detect similarities? What about slanted characters?

skotz commented 6 years ago

If you train a model on several samples of every possible character (say, 5 of each digit), then new instances of that number (not used during training) should have a high likelihood of being correctly classified. Does that answer the question?
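
To illustrate why several samples per character help, here is a toy sketch in standalone JavaScript. It assumes a simple nearest-neighbor comparison over flattened binary patterns, which may not match what cbl-js does internally; `distance`, `classify`, and the 3x3 patterns are all made up for the example.

```javascript
// Hypothetical helpers, not cbl-js internals.
// Count the positions where two equally sized binary patterns differ.
function distance(a, b) {
  let d = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) d++;
  }
  return d;
}

// Return the label of the stored sample closest to the given pattern.
function classify(model, pattern) {
  let best = null;
  for (const sample of model) {
    const d = distance(sample.pattern, pattern);
    if (best === null || d < best.distance) {
      best = { label: sample.label, distance: d };
    }
  }
  return best;
}

// Toy model: two 3x3 samples per character, flattened row by row.
const model = [
  { label: "1", pattern: "010010010" },
  { label: "1", pattern: "110010010" },
  { label: "7", pattern: "111001001" },
  { label: "7", pattern: "111001010" },
];

// A variant of "1" that matches none of the stored samples exactly.
const result = classify(model, "010010011");
console.log(result.label, result.distance);
```

The unseen pattern differs from every stored sample but is still closest to one of the "1" samples, so more varied samples per digit give new variants a better chance of landing near a correct match.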

skotz commented 6 years ago

So... I added a convolution filter method, meaning now you can play around with erosion filters (Google "convolution kernels" to find common ones). If you get the latest from master, you can now do this:

var cbl = new CBL({
    preprocess: function(img) {
        img.debugImage("debugPreprocessed");
        img.convolute([ [  1,   1,   1,   1,   1,   1,   1],
                        [  1,   1,   1,   1,   1,   1,   1],
                        [  1,   1,   1,   1,   1,   1,   1],
                        [  1,   1,   1, -48,   1,   1,   1],
                        [  1,   1,   1,   1,   1,   1,   1],
                        [  1,   1,   1,   1,   1,   1,   1], 
                        [  1,   1,   1,   1,   1,   1,   1] ]);
        img.debugImage("debugPreprocessed");
        img.cropRelative(0, 5, 0, 5);
        img.debugImage("debugPreprocessed");
        img.colorRegions(2, true);
        img.debugImage("debugPreprocessed");
    },
    character_set: "0123456789",
    blob_min_pixels: 2000,
    blob_max_pixels: 10000,
    pattern_width: 32,
    pattern_height: 32,
    pattern_maintain_ratio: true,
    allow_console_log: true,
    perceptive_colorspace: true,
    blob_debug: "debugSegmented",
    model_loaded: function() {
        document.getElementById("aSolve").style.display = "inline-block";
    }
});

Which gives really good results, if I do say so myself.

[Images: 1, 2, 3]

Here's how it segments the three samples you had.

[Images: c, b, a]
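
For anyone curious about what the 7x7 kernel in the snippet above computes, here is a rough standalone sketch of a 2D convolution over a grayscale array. It is not the library's `convolute` implementation; the `convolve` function, the edge-clamping border handling, and the sample pixel values are assumptions, and any clamping or normalization the library applies afterwards is not modeled.

```javascript
// Hypothetical helper, not the cbl-js implementation: convolve a 2D
// grayscale array with a square kernel and return the raw sums.
function convolve(image, kernel) {
  const height = image.length;
  const width = image[0].length;
  const half = Math.floor(kernel.length / 2);
  const out = [];
  for (let y = 0; y < height; y++) {
    const row = [];
    for (let x = 0; x < width; x++) {
      let sum = 0;
      for (let ky = 0; ky < kernel.length; ky++) {
        for (let kx = 0; kx < kernel[ky].length; kx++) {
          // Clamp coordinates so border pixels reuse the nearest edge value.
          const sy = Math.min(height - 1, Math.max(0, y + ky - half));
          const sx = Math.min(width - 1, Math.max(0, x + kx - half));
          sum += image[sy][sx] * kernel[ky][kx];
        }
      }
      row.push(sum);
    }
    out.push(row);
  }
  return out;
}

// The kernel from the snippet: all ones with -48 in the center.
const kernel = Array.from({ length: 7 }, (_, y) =>
  Array.from({ length: 7 }, (_, x) => (y === 3 && x === 3 ? -48 : 1))
);
// 48 ones plus one -48, so the weights cancel out.
const kernelSum = kernel.flat().reduce((a, b) => a + b, 0);

// A flat 9x9 light region, and a copy with one dark speck in the middle.
const flat = Array.from({ length: 9 }, () => Array(9).fill(200));
const speck = flat.map(row => row.slice());
speck[4][4] = 0;

const flatOut = convolve(flat, kernel);
const speckOut = convolve(speck, kernel);
console.log(kernelSum, flatOut[4][4], speckOut[4][4], speckOut[4][5]);
```

Because the weights sum to zero, a uniform region produces a raw response of zero, while a pixel that differs from its 7x7 surroundings produces a large one. That is why kernels of this shape are useful for separating thin noise from thick strokes.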

c4shm4st3r commented 6 years ago

Thanks a lot for the support. I'm now trying to use a bigger dataset.

I will try, thanks <3 <3

skotz commented 6 years ago

I updated the library itself to add the new convolute command, so you'll need to download the latest version for that last snippet to work.