Closed by mhutter 13 years ago
Good point, and yes, we discussed that. We're trying to find a solution, but it's not that easy. :)
I messed around a lot with Tesseract, but could not reach a 33% success ratio. Maybe I did something wrong. Still, I think the current solution is quite OK. But if someone wants to do captcha recognition, go ahead :)
And you can learn Python very quickly, especially if you know Ruby :)
This looks quite interesting. Combined with some image processing, one might be able to crack the captcha. http://code.google.com/p/pytesser/
The idea, as far as I understood it:
BTW: Started dabbling in a Python tutorial today... looks pretty easy/cool!
mhutter, exactly, that's what i did with commandline tools (imagemagick and tesseract). and i never really succeeded.
i'll probably try again next week with pytesser. might work.
what also could help is a training file for tesseract. but i never really understood how to make those, because i never invested enough time in it. could also help to solve the problem.
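For anyone picking this up: a minimal sketch of the kind of image clean-up that usually goes before handing a captcha to an OCR engine. It is plain Python on a grayscale grid rather than the imagemagick/tesseract command line mentioned above. The threshold of 128, the function names and the speck filter are assumptions for illustration, not what pyxtra does.

```python
def binarize(gray, threshold=128):
    """Map a grayscale grid (0-255) to 1 (ink) / 0 (background).

    The threshold is an arbitrary assumption; real captchas need tuning.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_specks(bw):
    """Clear ink pixels that have no ink neighbour (4-connectivity)."""
    if not bw:
        return []
    h, w = len(bw), len(bw[0])
    out = [row[:] for row in bw]
    for y in range(h):
        for x in range(w):
            if not bw[y][x]:
                continue
            neighbours = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            has_ink = any(
                0 <= ny < h and 0 <= nx < w and bw[ny][nx]
                for ny, nx in neighbours
            )
            if not has_ink:
                out[y][x] = 0  # isolated pixel, treat as noise
    return out
```

The cleaned grid would then be written back to an image file and passed to the OCR tool. Whether this helps at all depends on the kind of noise in the captcha.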
what about asciiart? so at least it's not another window (not that cool with a tiling wm like awesome).
luxflux, not an option if you want to send messages automatically ;-)
mhutter, if you want reliable auto-sent messages, e.g. server warnings, you should not rely on xtrazone and use a sms service instead.
luxflux, i tried it, it only works if the window is large enough. maybe we could make it an option, but shouldn't be the default imo.
gwrtheyrn I agree, but it would be a nice feature ;-)
gwrtheyrn, why does it need a big window? i mean how big is your console window? i use 139x42 for pyxtra...
or make it optional?
luxflux: 1 char != 1 pixel.. you need to scale in order to make the ascii captcha readable :)
petermanser: this was columns x lines, not pixels :) or the other way around :D
luxflux: this is a pretty large ascii-captcha, easily readable:
and this is the same captcha at a smaller size:
as you can see, as the screen/window size gets smaller, it's not readable anymore. especially if the captcha image is larger than the one i used as an example.
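To make the size problem concrete, here is a rough sketch of an image-to-ASCII conversion with nearest-neighbour scaling. The character ramp, the `char_aspect` factor (terminal cells are roughly twice as tall as wide) and the function name are assumptions, not the converter used for the examples above.

```python
RAMP = "@%#*+=-:. "  # dark -> light; an arbitrary choice of characters


def to_ascii(gray, cols, char_aspect=0.5):
    """Render a grayscale grid (0-255) as `cols` characters per line."""
    h, w = len(gray), len(gray[0])
    rows = max(1, round(h * cols / w * char_aspect))
    lines = []
    for r in range(rows):
        y = min(h - 1, r * h // rows)  # nearest source row
        line = []
        for c in range(cols):
            x = min(w - 1, c * w // cols)  # nearest source column
            line.append(RAMP[gray[y][x] * (len(RAMP) - 1) // 255])
        lines.append("".join(line))
    return lines


# One-pixel-wide vertical stripes: visible at full width, gone at half width.
stripes = [[0, 255] * 4 for _ in range(4)]
full = to_ascii(stripes, 8)
half = to_ascii(stripes, 4)
```

At 8 columns every stripe gets its own character. At 4 columns only every second pixel is sampled and the stripes collapse into a solid block, which is the same loss of detail as in the smaller captcha.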
@boardend: hm?
If you can put the ASCII-art "image" into a multidimensional char array, it should be easy to find the columns that are empty (filled with #). Then you can separate the chars and let the user solve them one by one. This solves the problem with the width of the shell.
Was just an idea, but could be a workaround, if the creation of the captcha-window fails?!
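A minimal sketch of that column-splitting idea, assuming the ASCII art is a list of strings and that `#` marks the background as described above. The function name is made up.

```python
def split_chars(lines, background="#"):
    """Cut ASCII art into pieces at columns that contain only background."""
    width = max(len(line) for line in lines)
    grid = [line.ljust(width, background) for line in lines]
    empty = [all(row[x] == background for row in grid) for x in range(width)]

    segments = []
    start = None
    for x, is_empty in enumerate(empty):
        if not is_empty and start is None:
            start = x  # a character begins here
        elif is_empty and start is not None:
            segments.append((start, x))  # the character ended one column ago
            start = None
    if start is not None:
        segments.append((start, width))

    return [[row[a:b] for row in grid] for a, b in segments]


art = [
    "#XX##X#",
    "#X###X#",
    "#XX##X#",
]
pieces = split_chars(art)
```

Each piece could then be shown to the user on its own. Letters that touch or overlap have no empty column between them and would come out as a single piece.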
But finally, OCR ftw :-)
Hm but keep in mind that sometimes letters in the Captcha-Image may be overlapping...
Ah, sounds like an interesting idea. But then it would be better to separate the letters before the ASCII-conversion.
I think either we should show the entire CAPTCHA as ASCII and simply require a sufficiently big resolution, or not do it at all.
But we'll give OCR another try next week.
Btw, http://en.wikipedia.org/wiki/Connected_component_labeling
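For reference, a small sketch of connected-component labeling on a binarized image (1 = ink, 0 = background), using a breadth-first flood fill with 4-connectivity. The function name and the choice of 4- over 8-connectivity are assumptions. This is a generic version of the technique, not code from pyxtra.

```python
from collections import deque


def label_components(bw):
    """Return (labels, count): each ink pixel gets its component number."""
    h = len(bw)
    w = len(bw[0]) if bw else 0
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if not bw[y][x] or labels[y][x]:
                continue  # background, or already part of a component
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and bw[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

Ideally each letter ends up as one component. Overlapping letters merge into a single component, and letters made of disconnected strokes split into several, so some post-processing would still be needed.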
@mhutter: They only overlap in about 5-10% of the cases. That's bearable. I think everything over a 33% success ratio is OK.
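A quick back-of-the-envelope check on why a 33% ratio might be enough, assuming each login attempt gets a fresh captcha and the attempts are independent (both are assumptions, as is the idea that the server allows retries):

```python
def success_within(p, attempts):
    """Probability of at least one correct read in `attempts` tries."""
    return 1 - (1 - p) ** attempts


def expected_attempts(p):
    """Mean number of tries until the first correct read."""
    return 1 / p
```

With p = 0.33 that is about three tries on average, and roughly a 70% chance of getting in within three tries.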
I think the user should be able to see when there are two chars in one piece?
I'm not sure whether it wouldn't be better to put this effort into automatic reading of the captcha. :)
automatically cracking captcha \o/, rewritten the login mechanism (closed by 5937edbcd9f265521810799256752f08ae3493fb)
works for me, thx!
awesome :)
Add automatic captcha verification so one can use this tool in automated jobs etc.
Possible 3rd-party tools: Tesseract-OCR
(Yes, I thought about contributing (provided I can spare some time) but I'm a Python-noob ;-) )