WithSecureLabs / captcha22

CAPTCHA22 is a toolset for building, and training, CAPTCHA cracking models using neural networks.
https://labs.f-secure.com/blog/releasing-the-captcha-cracken/
MIT License

You have '.png' hardcoded in the server. #22

Open kofoednielsen opened 3 years ago

kofoednielsen commented 3 years ago

If I run captcha22 client label --image-type jpg and give the generated zip file to the engine, I get this error:

Traceback (most recent call last):
  File "/usr/local/bin/captcha22", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/captcha22/__main__.py", line 278, in main
    args.func(args)
  File "/usr/local/lib/python3.6/site-packages/captcha22/lib/core/server.py", line 22, in server
    server.main()
  File "/usr/local/lib/python3.6/site-packages/captcha22/lib/server/captcha22.py", line 519, in main
    self.run_server()
  File "/usr/local/lib/python3.6/site-packages/captcha22/lib/server/captcha22.py", line 494, in run_server
    self.check_files()
  File "/usr/local/lib/python3.6/site-packages/captcha22/lib/server/captcha22.py", line 409, in check_files
    self.create_model(file)
  File "/usr/local/lib/python3.6/site-packages/captcha22/lib/server/captcha22.py", line 375, in create_model
    model = captcha(path, self.logger)
  File "/usr/local/lib/python3.6/site-packages/captcha22/lib/server/captcha22.py", line 55, in __init__
    self.get_image_size()
  File "/usr/local/lib/python3.6/site-packages/captcha22/lib/server/captcha22.py", line 61, in get_image_size
    img = cv2.imread(images[0])
IndexError: list index out of range

Likely because you have .png hardcoded here: https://github.com/FSecureLABS/captcha22/blob/f5e2662f64cbb4d2606982a1b1d0a449a081631f/captcha22/lib/server/captcha22.py#L60
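
A minimal sketch of that failure mode, assuming the server gathers its images with a glob pattern ending in .png. The first_image helper and the file name below are hypothetical stand-ins, not the project's actual code:

```python
import glob
import os
import tempfile


def first_image(data_dir, extension="png"):
    """Return the first file in data_dir with the given extension.

    Hypothetical stand-in for the server's image lookup, not its real code.
    """
    images = glob.glob(os.path.join(data_dir, "*." + extension))
    return images[0]  # raises IndexError when the glob matches nothing


with tempfile.TemporaryDirectory() as data_dir:
    # Create an empty placeholder named like a labelled JPG sample.
    open(os.path.join(data_dir, "abc123.jpg"), "w").close()

    try:
        first_image(data_dir)  # only looks for *.png
        outcome = "found"
    except IndexError:
        outcome = "IndexError"

    # The same lookup succeeds once the extension matches the files on disk.
    jpg_match = os.path.basename(first_image(data_dir, "jpg"))

print(outcome, jpg_match)
```

With only .jpg files in the directory, the .png glob returns an empty list, so indexing its first element fails in the same way as the traceback above.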

TinusGreen commented 3 years ago

Hi kofoednielsen,

PNG is hardcoded since AOCR only supports PNG and not JPG.

The labelling script supports labelling JPG files, but they still need to be converted to PNG before uploading to the server. I haven't gotten around to adding this conversion function to the labeller script.
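
Until that conversion exists, a quick check before uploading can catch archives that still contain JPG files. This is only a sketch: find_jpg_members is a hypothetical helper, and the archive layout in the demo (a data/ folder with images and a labels file) is an assumption, not the labeller's documented output.

```python
import os
import tempfile
import zipfile

JPG_EXTENSIONS = {".jpg", ".jpeg"}


def find_jpg_members(zip_path):
    """Return the names of JPG files inside the archive (hypothetical helper)."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name
            for name in archive.namelist()
            if os.path.splitext(name)[1].lower() in JPG_EXTENSIONS
        ]


with tempfile.TemporaryDirectory() as work_dir:
    # Build a small demo archive; the layout here is assumed for illustration.
    zip_path = os.path.join(work_dir, "captchas.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("data/abc123.png", b"")
        archive.writestr("data/def456.jpg", b"")
        archive.writestr("data/labels.txt", b"")

    leftover = find_jpg_members(zip_path)

print(leftover)
```

Any name in the returned list still needs to be converted to PNG before the archive is handed to the server.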

In the meantime, you can use something like this to convert all your images to PNG for uploading:

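import glob
import os

import cv2

img_dir = "<Add image dir>"
output_dir = "<Add output dir>"

# Convert every JPG in img_dir to a PNG with the same base name in output_dir.
for jpg_path in glob.glob(os.path.join(img_dir, "*.jpg")):
    # Swap only the file extension, so a "jpg" elsewhere in the path is left alone.
    base_name = os.path.splitext(os.path.basename(jpg_path))[0]
    png_path = os.path.join(output_dir, base_name + ".png")
    cv2.imwrite(png_path, cv2.imread(jpg_path))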

This is the small script I usually use for conversion, but I'd like to integrate it into the labelling script so that it converts images to PNG itself.

kofoednielsen commented 3 years ago

You can also do it with a simple imagemagick command: mogrify -format png *.jpg

Just wanted to let you know, as others might run into the same problem.

TinusGreen commented 3 years ago

Hi kofoednielsen,

Apologies for the late response but I've been busy at work.

Yes, command-line tools can also be used to solve this problem. However, I still want to build a converter into the labelling script.

I've reopened the issue and will close it once I get around to releasing the new version with the above update added.