ad-m / python-anticaptcha

Client library for solving captchas with Anticaptcha.com support.
http://python-anticaptcha.readthedocs.io/en/latest/
MIT License

Creating an ImageToTextTask from a string containing base64 data #75

Closed (dpellegr closed 4 years ago)

dpellegr commented 4 years ago

Hi, given a string encoding a png image in base64 (example given below), what is the most straightforward way to create the corresponding ImageToTextTask?

For a quick check, there exist many websites that render base64 data, for instance: https://codebeautify.org/base64-to-image-converter# Just copy-paste everything between the "quotes" ;-)

"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFsAAAAYCAYAAACV+oFbAAANZklEQVR42uVZe2xV9R3/vc65t71tobSFUilgwxsBAQVEFJHoMqcJUbapS3BZFt3mHpmLf8zg3JLN+Vg02bIsvrKZmGVziw9chhsqDMZDBspTEaS0hZbSQt+Pe885v99vn+/v3IuXWkqrkmUbyem53HvO+X2+39/n830d3lTx4Fqumce19riJfGmYx3TkSW19ZiKPW6s0lwFTIrRCBFbIkAn6LEMreWClwhnfcx5apUKDsxEqsJ7AmQdcilDjd85kaPA7VyyM8DzOWMiFF4TS4K/1LGOetdqXUeixyPiCcU/ojCeM9W1ocKZD+9waj0e43hqfAycHTsIuDeE3wG7d9/G1hF2EMXbYEON2NmjJQy6BE2fLYYuSAWGHjVhNBQZnDvuMlXRdqJUNuScDuCpk3ARRIhEy/FOZjMes8CULPRsSfu5ZWp8BNzA53GHkEzYFp0qujeRMS2YYDsCwVhpjpLBGWia0EEJrnJnAtgimNc50lzsYDs415/HZclyn4rPAdVpI+BqucM+xWhtPS1/qENdKqyycK7VgMBvmsEgKw6XFooSHwwsGmAQ3DhPD9wZnhUPoSJ5FQdgZYcdnq92ZncXOtCHsDNtAuAmTJDvIpixmmcMvtFU48BCBz7gZ8IEZLACBNGBoAVeGwtMqghsZyCC4ZEEk8SypgN1gfSGM8ylh4FGMnQ5lAJIzK+VZoDCOnI4LaecMOQ+A6bBMurMgkASexUANIwMEAMdgDR5EgMEgZ4ABAqMswHrwtdVYQyt4KJSB9AHShHCygO9BeahLMg49EAHsR0DdZwZMWNEw6zBzhx36MYQ/xuywM7qWGCCcU+HgiLAw53RhBGEnMpwlC1bDTc7Z7qDNiG0A52ATCAOnR5xsAHYWY/cChmVBDjicSOMICuxuO4kcUc6GGHvMbGKCA+vAu5vwH4mFI8dqt3jsaM3oDjrIbBWzGkzGInAPHImzcAyi+8AEYgWuwxM1HATQng5gCLaGQXJ4UES/ONACjjbkaKiLMHCnLka8AxwCj5XJ8WdZ/Z9T5H19j97WazNjFdQnsJNgiCiTRV0LdU3dF+2ixkEViQ0VjgkEFk4XkXaGWfwxWSYzLmIJAmDMhiyrWWw2MSbHCFICGWCIUXD04+GGpf08o3S/jYgZEU7wX3paovrY14qurbcU1UUoEZclSdKQRuAw64wASyIdM4O2EgQwOp/VI1fkvb07lzWb/hJsLhyJDcDDy2Th6SWFVfvvGjP74HAU2ShOJOvsyRuTNtFWzopPYjkeiMyog7ZtwX5Zl/AN++2qaGHzQEUqcrBjNovlR2Elx2piBOyNmQFgcKDJhQ5iNTlVOFaQBM+yIorDiNQnbZe3zzR8TjGvxWey08A+DW9kdFhRr0/ddjBdu+2J0Xe8YsBwSfEOW4elsuyMmfyyd2jqeFMQLAsr2+g75+gcP0eoyE7w4M2gaXEJ99srZeoEcNsemyk5km6fv6X/+M0Btz++u2L+gQsp8qX+rVPg+cJ5oualB9lXdosooq3wnmN/m7eOv71mj6ifssouaB2oSGJ2nFjAbPqSZEuspzgcszqbWACWWE1sEGByxOKwwZnUZ2MdDGQ5ZgDsRv3epaAHvyVx+e9Xj1p8JEK2FywRah6ptWee/1JT1Hb9B5lTb01n5eB3JJgLETxOLPj8x8T+WdtUwyIwXFyRHvdaMjIqTnxEiJEr8o1M8/iM1cnPF1ZveaD0yu2IWAFVHz/v3DXn+Y6D39nWf2L218XCvYQdK4BccLKh2sLT1sVqX0ciUnXsxBTKCcv5zGOkvJwiLQsTICCfZMrP8Fy8zlOkSy7EDCdDl1gMCgIWxy+RY3Uc9yh7c4rTMma04Cob73g2i8cOtwYMBzuOmJYqYLLLi2YdjyXp6UgZQclkLB/dQk7rNb0FVIGYgJIjZXPjqpAX5J45/
1KN8x7pXvHyrf1T3v51atdVMTGMzOWZgYok1jMeYx2oSMK+XbdeAsWzFX71UcLPsorsN66aYdV+ST3dR0olR8fMVi45agonlMGBvdm2zfCZ6F3IZ3bE2LX8C9t56Ubx/srJpvz9O9NLP8xP7sRqUpwaSbnUJvvUKdXvT+Bl7QVCXbBcajf9FR4TZypkcTrE9fnlUnPUPlmCWVV8TMaYc8ulF8X+ma+r2pvuysxel0JMui6oPpXwuCez8dph/QSKPBi1T0py2Xt5qqLVZvPM812HJq/vqV010SvZcX/lNduJ1Uh3YLB0rGY+5RkOX8esTss+rzvTN4Ohjv5q+PhDsAXITTKyuqjMFp24J7r+rSR5BsW3Iw6jMjVWpBpJufRiYteszf4Hix4Kb3tmKqvuvVC51M37Kkt56gixOlcuNYSni/7QtWVps22/qkZUbBtnU4gu3MuVS8+IXQsPy9MzHu1d/uwjqe2rl6Qr15Wiz7o6Pe40Sj6fYrVkceg4V5FWxA3M4IrM4Ol1uvvSCI3OtU0vr6VwE1qTyJCTVMGR749b/LsU8wKEFhDGxmxmHkKz0Sbhg3yotDOh3GT3jEfuSU61VZsnsvJm+Fp5+KnH9o3ey+uXPKxeXfMEv+OFiaYkEjpmts0qUo2kXDqqTo0fxQpap+qqdquGLpeORCcLMyYqbWZdi+4888srwBjsHxe0oQq0myBLd//Qv+nv3CW3uIGptZ0l//TrVzzXc8tTSay4Oj1t629S7179YNuCHZ+2gdkRtIwO0HUuSVT+Y7Jf0hAZEyH56BbTV/ROf8sNDzRt+snTNQXfm+tNOjNUA/N2/3uzKS7fya/bsjia1o1HoNM16BaN9yv5un7TO/CFferYuAnR7GayjYKuyCpSDbeBCfFVs+yqnKUn7B1OA7Mp/f4kMFFMkmWvpWQhxWeeZL5XIhPBSjn9xFRWGSDNejavgakxxelZpuzgn/wDs9dEM49c31fdvLnk+Lw9fmv5vP7ynk/TwKwPjteQjd8addnGRcmq5vyRwndPvtG5sevYPRs66qvnFNe0DNXANPH2Go+p7kXRtM58RVIDg80WtMYYUxhS72BdLjFxmUr123AbmA3Jw1UhosEcW107nAamwbZVwr/RveU3r7/ElGVQ1HvMRr4MEROY9jnKpcEamB/0Ldt6X+Ff16yJZtRSYrm/c+62XxTvXbGgp3Trp2lgDkWd1QVcdS1OVrWYAYoEKQw5qdQv7rrQSKHT9tSU8eJ6PshIYaf68IoyU9i6IJrYwQcZKagY7IUbmP2ioUpyEdxgFtS6imOQBia/XGpjfRU+V81VdlyaEotHCWwYDUzSGv5YxzV/zpVLxcw3V2bKG54uPTT3njNTDq8vbpi4rGtsW6kL08MfKRzXPROrVKrODqLIXT1NN6WEf/TWitlHjVHnHSlslXsqAhOOTVmv7kW5ZZoXWRUKnWwR7WP3yvrLe204+hvpZesSIbgzyEhB8WE2MPXizCVjWXGjQvFjBmlg4BxjXbnkaUoy3aZvymhRvMuVS4CMh2JnsQ9DNDC5cqkEIZAmdrkGZlXnpBNPjtlf9VTp4bnNqq/0xo7KtpGMFHaGraM6TVA+ySv54LmefVPQI0TdRvMPo/aKd/ubl6dtOPb2srlri7xUYIYYKbxl91+GKoQdY62L6/mmqxiSD4WwYpbsHK9Hnfp238o3lqSru0Q8kfzYSEENp4E5IJuKTsvu8mVm+qbzNTD55dIGu7scISdVwUcdRasiPRrWhK77oavkYA1M3Nayj5VLuQZmjFaZ/V7HxMca521HW6RGMlJ4JV03jRSwL3P6ajpY9l+h8E6OkQWHvzl+0WNfHjf/CB+kgclX5I/C23cA3G5DI10csN4XNNoNtS/c6Bc5iFnvfCMFNZwG5h1VNx4byhfoKbX5DYxTAzUw8txyaSWf37S8asndZJDMZDxDgxoXPrQboYJUKNdYPI7Mn3/YbObOK5eIn6+mGmpqVe/Yn56cs
82jWD3CkcLDY5Zs/lnZ0jfRTgVGCTfHRrhDMEuEYDLNu9GWKKdIsiGET9B+jViRDv8QIwU1nHnve6JxYsLKnivZzKaLOe81kOvAcolKu/mZUW03d1S2+ujjPuuRwlANzGehyHNGCoYWHaRc4tl5LyWaRnG6upKVNTTy9sRHoPMaGBkzggxzDQxkGIlAyog5FlPv7santCjNe02WyQPmvRQXCbDJtt8mm8Unp1OQBxcXY6Rgsop0VdSAkYJToIgc9sFGCh/hjzMEVU9DjRQoZg85792hDlcETBfUiVMz7xfPTqfmxNdeB+5AtlZd8+WM1+/2V+/8f34Dcz5FDhwpqAu9gVnEZjY+Hk54si3Ry5tYe6KXZViL7Ez2spC3m87iOYlpx4Yql/4f3sCIQRqYwUYKajhvYCrV6PQ4Wx7OlJOzL0u9AE5BglFh5Mkhy6X/tjcwF1ORIr9cEixmt8hl8wFvYBxTsg0MMSrXwFC808QMQ4kFiQGHyMY793YSDOGOnXGsE/nxLm/ea1hcaZisDPPfwIi8Bsac08DwLOYsdtrC7DvRgQ2Mw54bKVBPQIqkcOKwZ/OMwx5lY3U2vziixHjFOYnROOfGuHPYkU9ysTqrSJFVpPokb2AGNjAXs1z6X3on+m+3uSLIg8eQzAAAAABJRU5ErkJggg=="

ad-m commented 4 years ago

See https://github.com/ad-m/python-anticaptcha/blob/master/examples/text.py for an example. You don't have to encode anything by hand.

dpellegr commented 4 years ago

There is no encoding by hand: I get that string from the source of a webpage. Contrary to the example, I do not have a file to load from disk (unless I write the data to disk first, but that is silly).

dpellegr commented 4 years ago

I had a look at the source. I think the easiest way is to add a class StringToTextTask whose serialize function does not redo the encoding. Would you accept a pull request?
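
To make the proposal concrete, here is a rough sketch of what such a class might look like. It is an illustration only: `StringToTextTask` does not exist in the library, it is written standalone rather than against the library's real base classes, and the `type`/`body` field names are assumed from the Anti-Captcha JSON task format.

```python
import base64
import binascii


class StringToTextTask:
    """Hypothetical task that accepts an already base64-encoded image."""

    # Assumed task type name expected by the service for image captchas.
    type = "ImageToTextTask"

    def __init__(self, b64_body):
        # Drop a "data:<mime>;base64," prefix if one is present.
        if b64_body.startswith("data:"):
            _, _, b64_body = b64_body.partition(",")
        # Fail early on malformed input instead of sending it to the service.
        try:
            base64.b64decode(b64_body, validate=True)
        except binascii.Error as exc:
            raise ValueError("body is not valid base64") from exc
        self.body = b64_body

    def serialize(self):
        # Pass the string through unchanged; there is no second encoding step.
        return {"type": self.type, "body": self.body}
```

The only difference from a file-based task in this sketch is that `serialize` forwards the string as-is instead of reading and encoding a file object.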

ad-m commented 4 years ago

I can accept a PR that does not force encoding on the library side.

For Python 3, you can also use something like:

from io import BytesIO
import base64
# url is the "data:image/png;base64,..." string; the payload follows the first comma
fp = BytesIO(base64.b64decode(url.split(',', 1)[1]))

dpellegr commented 4 years ago

Fair enough, BytesIO(base64.b64decode(...)) did the trick without extra effort. Thank you!
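
For later readers, the approach that resolved this issue can be sketched as a small helper. The function name `data_uri_to_file` is made up for illustration. The commented-out library calls follow the sequence used in examples/text.py as best understood here, so check them against the version you have installed.

```python
import base64
from io import BytesIO


def data_uri_to_file(data_uri):
    """Return a binary file-like object holding the decoded image bytes."""
    # Everything before the first comma is the header, e.g. "data:image/png;base64".
    header, sep, payload = data_uri.partition(",")
    if not sep or not header.endswith(";base64"):
        raise ValueError("expected a 'data:<mime>;base64,<payload>' string")
    return BytesIO(base64.b64decode(payload))


# Usage with the library (assumed call sequence; needs a real API key
# and network access, so it is left commented out):
#
#   from python_anticaptcha import AnticaptchaClient, ImageToTextTask
#   client = AnticaptchaClient(api_key)
#   job = client.createTask(ImageToTextTask(data_uri_to_file(data_uri)))
#   job.join()
#   print(job.get_captcha_text())
```

Splitting on the first comma rather than on `;` keeps the `base64` marker out of the payload, so only the encoded image data reaches the decoder.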