skotz / cbl-js

JavaScript CAPTCHA solving library
MIT License

Please help me configure SVG captcha #44

Open uitcode opened 4 years ago

uitcode commented 4 years ago

I am currently trying to solve this type of SVG image CAPTCHA (https://www.npmjs.com/package/svg-captcha). I tried to configure the library, but I still could not filter out the characters cleanly: only about 60% come out right, and some of them are blurred. I hope you can help. (Sample images 1 to 10 were attached.)

skotz commented 4 years ago

This is a fun one. There isn't any distortion, it only uses digits, there are only two characters, and they're always in the same location.

The easiest thing to do here is binarize the image and specify the exact character locations.

var cbl = new CBL({
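    preprocess: function(img) {
        // Show the image after each step in the "debugPreprocessed" element.
        img.debugImage("debugPreprocessed");
        // Invert the colors.
        img.invert(1);
        img.debugImage("debugPreprocessed");
        // Reduce to pure black and white with a threshold of 200.
        img.binarize(200);
        img.debugImage("debugPreprocessed");
    },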
    fixed_blob_locations: [
        { x1: 38, y1: 15, x2:  56, y2: 42 },
        { x1: 88, y1: 15, x2: 103, y2: 42 }
    ],
    character_set: "0123456789",
    pattern_width: 32,
    pattern_height: 32,
    allow_console_log: true,
    blob_console_debug: true,
    blob_debug: "debugSegmented"
});

This extracts everything perfectly.
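To make it clearer what "binarize and specify the exact character locations" amounts to on raw pixels, here is a minimal standalone sketch. It does not use cbl-js and is not a description of its internals: the function names invertAndBinarize and cropBox are hypothetical, the input is assumed to be a flat RGBA Uint8ClampedArray like a canvas ImageData.data, and x2/y2 are assumed to be exclusive bounds.

```javascript
// Invert the brightness of each pixel, then threshold it to pure black or white.
// Assumption: values below the threshold become black (0), the rest white (255).
function invertAndBinarize(rgba, threshold) {
    var out = new Uint8ClampedArray(rgba.length);
    for (var i = 0; i < rgba.length; i += 4) {
        // Average the color channels, then invert (255 - brightness).
        var inverted = 255 - (rgba[i] + rgba[i + 1] + rgba[i + 2]) / 3;
        var value = inverted < threshold ? 0 : 255;
        out[i] = out[i + 1] = out[i + 2] = value;
        out[i + 3] = 255; // fully opaque
    }
    return out;
}

// Copy a fixed rectangle out of the image, one row at a time.
// Assumption: box = { x1, y1, x2, y2 } with x2 and y2 exclusive.
function cropBox(rgba, imageWidth, box) {
    var w = box.x2 - box.x1;
    var h = box.y2 - box.y1;
    var out = new Uint8ClampedArray(w * h * 4);
    for (var y = 0; y < h; y++) {
        var srcStart = ((box.y1 + y) * imageWidth + box.x1) * 4;
        out.set(rgba.subarray(srcStart, srcStart + w * 4), y * w * 4);
    }
    return { width: w, height: h, data: out };
}
```

Because the characters never move in this CAPTCHA, cropping the same two boxes from every binarized image should yield consistent per-character samples, which is what makes the fixed-location approach attractive here.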


Once you train, save, and visualize the model, it looks really clean, so you should be able to get 100% accuracy on this CAPTCHA.


skotz commented 4 years ago

And actually, if the source CAPTCHA is truly an SVG, then you might be able to literally parse the answer out of the SVG path data directly and not even worry about pixel data. But either way seems to work.
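As a rough sketch of the first step of that idea, the snippet below pulls the path elements out of an SVG string and orders the likely glyph paths from left to right. It rests on assumptions that should be checked against real output: that noise lines carry a stroke attribute while character glyphs do not, and that the first "M x y" coordinate of a path is a good enough proxy for its horizontal position. The function name extractGlyphPaths is hypothetical, and mapping each path to an actual digit (for example, by comparing path data against known glyphs) is not covered.

```javascript
// Split an SVG CAPTCHA into candidate character paths, ordered left to right.
// Assumption: noise paths have a stroke attribute, glyph paths do not.
function extractGlyphPaths(svg) {
    var tags = svg.match(/<path\b[^>]*>/g) || [];
    return tags
        // Drop paths that look like noise lines.
        .filter(function (tag) { return !/\bstroke=/.test(tag); })
        // Read the path data and the x coordinate of its first move command.
        .map(function (tag) {
            var d = (/\bd="([^"]*)"/.exec(tag) || [])[1] || "";
            var start = /M\s*(-?[\d.]+)/.exec(d);
            return { d: d, x: start ? parseFloat(start[1]) : NaN };
        })
        // Ignore paths without a usable starting coordinate.
        .filter(function (p) { return !isNaN(p.x); })
        // Paths may appear in any order in the markup, so sort by position.
        .sort(function (a, b) { return a.x - b.x; });
}
```

Calling extractGlyphPaths(svgString) returns an array of { d, x } objects, one per candidate character. A sturdier version could use the minimum x over all coordinates in each path instead of only the starting point.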

uitcode commented 4 years ago

> And actually, if the source CAPTCHA is truly an SVG, then you might be able to literally parse the answer out of the SVG path data directly and not even worry about pixel data. But either way seems to work.

It worked well, thank you very much. I will look into separating each character in the SVG code, as you suggested, for a more efficient solution.