c4shm4st3r closed this issue 6 years ago
What you really need here is an erosion filter, but unfortunately I haven't implemented that yet. You can try to get by with repeated blurring and binarization since that'll erode away edges until only the thickest lines (the numbers) are left. It's a hack, but the only other option is to implement more image manipulation methods.
var cbl = new CBL({
    preprocess: function(img) {
        // Blur then binarize seven times; each pass wears thin lines away a little more.
        for (var i = 0; i < 7; i++) {
            img.blur(5);
            img.debugImage("debugPreprocessed");
            img.binarize(45);
            img.debugImage("debugPreprocessed");
        }
        img.colorRegions(40, true);
        img.debugImage("debugPreprocessed");
    },
    character_set: "0123456789",
    blob_min_pixels: 1000,
    blob_max_pixels: 10000,
    pattern_width: 128,
    pattern_height: 128,
    pattern_maintain_ratio: true,
    allow_console_log: true,
    perceptive_colorspace: true,
    blob_debug: "debugSegmented",
    model_loaded: function() {
        // Show the solve link once the model is ready.
        document.getElementById("aSolve").style.display = "inline-block";
    }
});
cbl.train("3.png");
cbl.train("2.png");
cbl.train("1.png");
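For anyone curious what a real erosion filter would do, here is a minimal sketch in plain JavaScript. It is not part of CBL: the `erode` function, the `radius` parameter, and the 0/1 grid format are all assumptions made for illustration. A pixel stays set only if every pixel in the square around it is also set, so thin strokes vanish while the cores of thick strokes survive.

```javascript
// Minimal sketch of binary erosion; NOT a CBL method.
// `grid` is a hypothetical 2D array of 0 (background) and 1 (ink).
function erode(grid, radius) {
  var height = grid.length;
  var width = grid[0].length;
  var out = [];
  for (var y = 0; y < height; y++) {
    var row = [];
    for (var x = 0; x < width; x++) {
      var keep = 1;
      // Check every pixel in the (2 * radius + 1) square around (x, y).
      for (var dy = -radius; dy <= radius && keep; dy++) {
        for (var dx = -radius; dx <= radius && keep; dx++) {
          var ny = y + dy;
          var nx = x + dx;
          // Pixels outside the image count as background.
          if (ny < 0 || ny >= height || nx < 0 || nx >= width || grid[ny][nx] !== 1) {
            keep = 0;
          }
        }
      }
      row.push(keep);
    }
    out.push(row);
  }
  return out;
}

// A thick 3x3 block with a one-pixel-wide line sticking out of it.
var sample = [
  [0, 0, 0, 0, 0, 0, 0],
  [0, 1, 1, 1, 0, 0, 0],
  [0, 1, 1, 1, 1, 1, 1],
  [0, 1, 1, 1, 0, 0, 0],
  [0, 0, 0, 0, 0, 0, 0]
];
var eroded = erode(sample, 1);
```

With a radius of 1, the thin line is removed entirely and only the center of the thick block is left, which is the effect the repeated blur and binarize passes approximate.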
Damn, I'm learning quite a lot with this library. Thanks for the reply, I'll play around with it.
Just one more question: if the captcha character is not in the dataset, does it still detect similarities?
Imagine this: I have 30 different samples of the number 3, but a new one comes in that differs from all of them. Does it still detect similarities? How about slanted (inclined) characters?
If you train a model on several samples of every possible character (say, 5 of each digit), then new instances of that number (not used during training) should have a high likelihood of being correctly classified. Does that answer the question?
So... I added a convolution filter method, meaning now you can play around with erosion filters (Google "convolution kernels" to find common ones). If you get the latest from master, you can now do this:
var cbl = new CBL({
    preprocess: function(img) {
        img.debugImage("debugPreprocessed");
        img.convolute([ [ 1, 1, 1,   1, 1, 1, 1],
                        [ 1, 1, 1,   1, 1, 1, 1],
                        [ 1, 1, 1,   1, 1, 1, 1],
                        [ 1, 1, 1, -48, 1, 1, 1],
                        [ 1, 1, 1,   1, 1, 1, 1],
                        [ 1, 1, 1,   1, 1, 1, 1],
                        [ 1, 1, 1,   1, 1, 1, 1] ]);
        img.debugImage("debugPreprocessed");
        img.cropRelative(0, 5, 0, 5);
        img.debugImage("debugPreprocessed");
        img.colorRegions(2, true);
        img.debugImage("debugPreprocessed");
    },
    character_set: "0123456789",
    blob_min_pixels: 2000,
    blob_max_pixels: 10000,
    pattern_width: 32,
    pattern_height: 32,
    pattern_maintain_ratio: true,
    allow_console_log: true,
    perceptive_colorspace: true,
    blob_debug: "debugSegmented",
    model_loaded: function() {
        document.getElementById("aSolve").style.display = "inline-block";
    }
});
Which gives really good results, if I do say so myself.
Here's how it segments the three samples you had.
Thanks a lot for the support, I'm now trying to use a bigger dataset.
I will try, thanks <3 <3
I updated the library itself to add the new convolute command, so you'll need to download the latest version for that last snippet to work.
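In case it helps to see what a convolution kernel does to each pixel, here is a rough standalone sketch. It is only an illustration under my own assumptions, not the library's actual convolute code: the `convolve` function, the 2D array of grayscale values, the edge clamping, and the 0 to 255 output clamp are all hypothetical choices. The kernel is applied without flipping, which makes no difference for symmetric kernels like the one in the snippet.

```javascript
// Sketch of applying a square kernel to a grayscale image.
// Illustration only; NOT CBL's convolute() implementation.
// `pixels` is a hypothetical 2D array of values in the range 0-255.
function convolve(pixels, kernel) {
  var height = pixels.length;
  var width = pixels[0].length;
  var half = Math.floor(kernel.length / 2);
  var out = [];
  for (var y = 0; y < height; y++) {
    var row = [];
    for (var x = 0; x < width; x++) {
      var sum = 0;
      for (var ky = 0; ky < kernel.length; ky++) {
        for (var kx = 0; kx < kernel[ky].length; kx++) {
          // Clamp coordinates so edge pixels reuse the nearest valid pixel.
          var py = Math.min(height - 1, Math.max(0, y + ky - half));
          var px = Math.min(width - 1, Math.max(0, x + kx - half));
          sum += pixels[py][px] * kernel[ky][kx];
        }
      }
      // Clamp the weighted sum to the range of an 8-bit channel.
      row.push(Math.min(255, Math.max(0, sum)));
    }
    out.push(row);
  }
  return out;
}

// A 3x3 analogue of the 7x7 kernel above: the weights sum to zero.
var kernel3 = [
  [1,  1, 1],
  [1, -8, 1],
  [1,  1, 1]
];

// A perfectly uniform image.
var flat = [
  [100, 100, 100],
  [100, 100, 100],
  [100, 100, 100]
];
var flatResult = convolve(flat, kernel3);

// A single bright pixel on a dark background.
var spot = [
  [0,   0, 0],
  [0, 255, 0],
  [0,   0, 0]
];
var spotResult = convolve(spot, kernel3);
```

Because the weights sum to zero (48 ones and a center of -48 in the 7x7 case), any uniform area maps to 0, and only pixels that differ from their surroundings produce a non-zero response.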
Hello, I have a harder challenge: how would I use the library to pre-segment and actually solve this single-character captcha?
It is a single digit from 1 to 9.
I have a working method, but I'd like to do it with this method as well, just out of curiosity about how to get it working in JavaScript.
Those images are just a small sample of the larger dataset. How would I get rid of the noisy background?
Thanks in advance.