skotz / cbl-js

JavaScript CAPTCHA solving library
MIT License

Help to solve this captcha. #40

Closed lakudo closed 4 years ago

lakudo commented 4 years ago

Hi, I am trying to recognize this captcha and have run into a problem. Captcha samples: img1 img2 img3 img4 img5. When I run train.html with this configuration:

    var cbl = new CBL({
        preprocess: function(img) {
            img.binarize(32);
            img.debugImage("debugPreprocessed");
            img.blur(1);
            img.debugImage("debugPreprocessed");
            img.binarize(255);
            img.debugImage("debugPreprocessed");
            img.colorRegions(40, true);
            img.debugImage("debugPreprocessed");
        },
        character_set: "0123456789",
        model_file: "model.txt",
        blob_min_pixels: 40,
        blob_max_pixels: 350,
        pattern_width: 24,
        pattern_height: 24,
        pattern_maintain_ratio: true,
        allow_console_log: true,
        perceptive_colorspace: true,
        blob_debug: "debugSegmented"
    });

Training on 16 samples gives this result: train

Then I updated model.txt and solve.html, but the solver can't fully identify the captcha. Target captcha: img6 Result: solve

    var cbl = new CBL({
        preprocess: function(img) {
            img.binarize(100);
            img.blur(1);
            img.binarize(220);
            img.colorRegions(40);
        },
        /* Load the model we saved during training. */
        model_file: "model.txt",
        character_set: "0123456789",
        blob_min_pixels: 40,
        blob_max_pixels: 350,
        pattern_width: 24,
        pattern_height: 24,
        perceptive_colorspace: true,
        /* Define a method that fires immediately after successfully loading a saved model. */
        model_loaded: function() {
            // Don't enable the solve button until the model is loaded.
            document.getElementById('solve').style.display = "block";
        }
    });   

Is this because there aren't enough training samples, or is something else wrong? Could you help me fix it? Thanks!

skotz commented 4 years ago

Looks like you did a good job segmenting. Are you using the same image preprocessing steps in both the trainer and solver? The two scripts you have above have slightly different values. The solver preprocess steps need to be exactly the same as the trainer steps used when the model was generated.

If that doesn't fix it, then one thing you could do is load your model.txt file and then do a cbl.visualizeModel("theIdOfSomeDiv") and see if it outputs all of the patterns you're expecting.
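One way to keep the trainer and solver from drifting apart is to define the options both pages need in a single object and derive each page's options from it. The following is a minimal sketch that only assembles plain objects: `sharedOptions`, `buildOptions`, `trainerOptions`, and `solverOptions` are hypothetical names, and the option values are copied from the trainer snippet posted in this thread.

```javascript
// Options that should be identical in train.html and solve.html.
// Values come from the trainer snippet in this thread (assumption: these
// are the settings the model was actually trained with).
var sharedOptions = {
    preprocess: function (img) {
        img.binarize(100);
        img.blur(1);
        img.binarize(220);
        img.colorRegions(40, true);
    },
    character_set: "0123456789",
    model_file: "model.txt",
    blob_min_pixels: 40,
    blob_max_pixels: 350,
    pattern_width: 24,
    pattern_height: 24,
    pattern_maintain_ratio: true,
    perceptive_colorspace: true
};

// Hypothetical helper: copy the shared options, then add page-specific ones.
function buildOptions(extra) {
    return Object.assign({}, sharedOptions, extra || {});
}

// Trainer-only additions (debug output).
var trainerOptions = buildOptions({
    allow_console_log: true,
    blob_debug: "debugSegmented"
});

// Solver-only additions (callback after the model loads).
var solverOptions = buildOptions({
    model_loaded: function () {
        document.getElementById("solve").style.display = "block";
    }
});

// In train.html: var cbl = new CBL(trainerOptions);
// In solve.html: var cbl = new CBL(solverOptions);
```

With this layout, a setting that affects how patterns are built can only be changed in one place, so the two pages cannot silently disagree on it.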

lakudo commented 4 years ago

> Looks like you did a good job segmenting. Are you using the same image preprocessing steps in both the trainer and solver? The two scripts you have above have slightly different values. The solver preprocess steps need to be exactly the same as the trainer steps used when the model was generated.

Sorry, I just double-checked the train file; that was a typo. Both are using the same image preprocessing steps:

    var cbl = new CBL({
        preprocess: function(img) {
            img.binarize(100);
            img.debugImage("debugPreprocessed");
            img.blur(1);
            img.debugImage("debugPreprocessed");
            img.binarize(220);
            img.debugImage("debugPreprocessed");
            img.colorRegions(40, true);
            img.debugImage("debugPreprocessed");
        },
        character_set: "0123456789",
        model_file: "model.txt",
        blob_min_pixels: 40,
        blob_max_pixels: 350,
        pattern_width: 24,
        pattern_height: 24,
        pattern_maintain_ratio: true,
        allow_console_log: true,
        perceptive_colorspace: true,
        blob_debug: "debugSegmented"
    });

> If that doesn't fix it, then one thing you could do is load your model.txt file and then do a cbl.visualizeModel("theIdOfSomeDiv") and see if it outputs all of the patterns you're expecting.

I'm a little confused. How do I run cbl.visualizeModel("theIdOfSomeDiv")? Do I run it in the console on the solver page?

skotz commented 4 years ago

First you'll need to add a new div with an id to the page like <div id="test"></div> and then yes, run cbl.visualizeModel("test") in the console. That should output a bunch of the trained images to that div.
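The same steps can be wrapped in a small helper that creates the div when it is missing and then calls `visualizeModel`. This is only a sketch: `showModel` is a hypothetical name, and the `cbl` and `doc` parameters are assumed to be the CBL instance from the solver page and the browser's `document`.

```javascript
// Hypothetical helper: make sure a container with the given id exists,
// then ask CBL to render the trained patterns into it.
function showModel(cbl, doc, containerId) {
    var container = doc.getElementById(containerId);
    if (!container) {
        // No element with that id yet, so add an empty div to the page.
        container = doc.createElement("div");
        container.id = containerId;
        doc.body.appendChild(container);
    }
    cbl.visualizeModel(containerId);
    return container;
}
```

In the browser console on the solver page this would be called as `showModel(cbl, document, "test")` once the model has finished loading.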

lakudo commented 4 years ago

cbl.visualizeModel("test")

> First you'll need to add a new div with an id to the page like <div id="test"></div> and then yes, run cbl.visualizeModel("test") in the console. That should output a bunch of the trained images to that div.

I just added an id to a div and ran cbl.visualizeModel("test"); the output looks good. debug

<div id="test" class="main">
    <img id="captcha" src="6.jpg" />
    <br />
    <input type="text" id="solution">
    <br />
    <a href="javascript: void(0)" id="solve" onclick="solve()" style="display: none">Solve!</a>
</div>
<script>
    var cbl = new CBL({
        preprocess: function(img) {
            img.binarize(100);
            img.blur(1);
            img.binarize(220);
            img.colorRegions(40);

        },
        /* Load the model we saved during training. */
        model_file: "model.txt",
        character_set: "0123456789",
        blob_min_pixels: 40,
        blob_max_pixels: 350,
        pattern_width: 24,
        pattern_height: 24,
        //perceptive_colorspace: true,
        /* Define a method that fires immediately after successfully loading a saved model. */
        model_loaded: function() {
            // Don't enable the solve button until the model is loaded.
            document.getElementById('solve').style.display = "block";
        }
    });    

    var solve = function() {
        // Using the saved model, attempt to find a solution to a specific image.
        cbl.solve("captcha").done(function (solution) {
            // Upon finding a solution, fill the solution textbox with the answer.
            document.getElementById('solution').value = solution;
            alert(solution)
        });
    }
</script>

skotz commented 4 years ago

That looks pretty good. Maybe one of the samples is mislabeled. Try doing this:

cbl.condenseModel();
cbl.sortModel();
cbl.visualizeModel("test");
cbl.saveModel();

And then save it as a new model file. This will combine similar patterns and hopefully help eliminate any one-off errors in classification.

lakudo commented 4 years ago

I tried again following your steps and switched to the new condensed model file, which looks good, but it still doesn't work. condense debug2 I also tried some new captchas that include the number "1" and found something weird: every "1" is recognized as "3". debug3 debug4

skotz commented 4 years ago

Yeah that's really odd. To debug it further I'd need a copy of your model.

lakudo commented 4 years ago

You can get the model.txt file from the following links: condensed model.txt uncondensed model.txt

skotz commented 4 years ago

I compared the patterns with the segmented letters and noticed that you're scaling them during training but stretching them during solving.

image

If you add this line to your solver just like you have with your trainer, it works great.

pattern_maintain_ratio: true

image

Hopefully that'll do it!
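To illustrate the difference between scaling and stretching, here is a small sketch of the arithmetic for fitting a segmented blob into the 24x24 pattern. It is not cbl-js's actual resize code; `fitPattern` and the 10x20 blob size are made up for the example.

```javascript
// Hypothetical illustration: compute the size a blob ends up with inside
// the pattern, with and without preserving its aspect ratio.
function fitPattern(blobWidth, blobHeight, patternWidth, patternHeight, maintainRatio) {
    if (!maintainRatio) {
        // Stretch: the blob fills the whole pattern, whatever its shape.
        return { width: patternWidth, height: patternHeight };
    }
    // Scale: use the largest factor that still fits in both dimensions.
    var scale = Math.min(patternWidth / blobWidth, patternHeight / blobHeight);
    return {
        width: Math.round(blobWidth * scale),
        height: Math.round(blobHeight * scale)
    };
}

// A narrow 10x20 blob, such as a thin digit:
var stretched = fitPattern(10, 20, 24, 24, false); // { width: 24, height: 24 }
var scaled = fitPattern(10, 20, 24, 24, true);     // { width: 12, height: 24 }
```

A pattern trained from the scaled 12x24 shape and a candidate stretched to 24x24 no longer line up pixel for pixel, and narrow characters are affected the most, which may be why "1" was the digit that went wrong here.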

lakudo commented 4 years ago

Yeah! I had left out the option "pattern_maintain_ratio: true" in the solver, while it was set to true in the trainer. It looks like the trainer and solver scripts need to use exactly the same settings. Much obliged.
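A mismatch like this can also be caught mechanically by comparing the two option objects before training or solving. The sketch below is an assumption-laden example rather than anything cbl-js provides: `diffOptions` is a hypothetical helper, and it compares only simple values (numbers, strings, booleans), not the `preprocess` function.

```javascript
// Hypothetical helper: return the keys whose values differ between the
// trainer and solver options. A key missing on one side counts as different.
function diffOptions(trainer, solver, keys) {
    return keys.filter(function (key) {
        return String(trainer[key]) !== String(solver[key]);
    });
}

// Pattern and blob settings as posted in this thread.
var trainer = {
    blob_min_pixels: 40,
    blob_max_pixels: 350,
    pattern_width: 24,
    pattern_height: 24,
    pattern_maintain_ratio: true
};
var solver = {
    blob_min_pixels: 40,
    blob_max_pixels: 350,
    pattern_width: 24,
    pattern_height: 24
};

var mismatched = diffOptions(trainer, solver, Object.keys(trainer));
// mismatched is ["pattern_maintain_ratio"]
```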