Hello, I'm working on a project that does extensive WHOIS parsing at quite a large scale, so I can relate to your work. I noticed that you're documenting which gTLDs your project supports and even made a table of them. Some entries have comments about the captchas that many of the HTTP-based ones use. FWIW, I found it's relatively easy to "break" those captchas using just ImageMagick `convert` and `tesseract`. YMMV, but it breaks 9 out of 10 for me. It's very hacky and unsophisticated, as I'm not a computer-vision or image-analysis guy, but here it goes:
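As a rough sketch only, the two tools could be chained from Python roughly like this. The function names, the 200% resize, the 50% threshold, and page-segmentation mode 8 are illustrative assumptions rather than tuned values, and the `--psm` spelling assumes tesseract 4 or newer:

```python
import shutil
import subprocess


def build_convert_cmd(src, dst, threshold="50%"):
    """Build an ImageMagick command that grayscales, upscales, and binarizes.

    The resize factor and threshold are assumed starting points; real
    captchas will likely need different values.
    """
    return [
        "convert", src,
        "-colorspace", "Gray",    # drop color information
        "-resize", "200%",        # enlarge so glyphs are easier to recognize
        "-threshold", threshold,  # force every pixel to pure black or white
        dst,
    ]


def build_tesseract_cmd(image, psm=8):
    """Build a tesseract command that prints the recognized text to stdout.

    psm 8 asks tesseract to treat the image as a single word (an assumption
    that fits short captchas; older tesseract releases spell the flag -psm).
    """
    return ["tesseract", image, "stdout", "--psm", str(psm)]


def read_captcha(src, cleaned="cleaned.png"):
    """Run both tools in sequence and return the recognized text."""
    for tool in ("convert", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    subprocess.run(build_convert_cmd(src, cleaned), check=True)
    result = subprocess.run(
        build_tesseract_cmd(cleaned),
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()
```

Calling `read_captcha("captcha.png")` on a hypothetical image file would write `cleaned.png` next to it and return whatever text tesseract recognizes. Keeping the command builders separate from the runner makes it easy to tweak the preprocessing flags per registry.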
Please close this issue when you see it.
Hope it's helpful.