JackonYang / captcha-tensorflow

Image Captcha Solving Using TensorFlow and CNN Model. Accuracy 90%+
MIT License
996 stars 272 forks

Forced exit task while generating the dataset #41

Closed snowfluke closed 2 years ago

snowfluke commented 2 years ago

I tried to generate a dataset with 6 letters as follows:

!python3 datasets/gen_captcha.py -dul --npi 6 -n 1 -c 10000 --data_dir ./gdrive/MyDrive/Teledrop-Datasets/

But it turns out that the process in Google Colab exits automatically because of RAM usage. Is there any way to limit the RAM usage?

JackonYang commented 2 years ago

@snowfluke

I think the memory cost is caused by https://github.com/JackonYang/captcha-tensorflow/blob/master/datasets/gen_captcha.py#L35 and L43 — the code pattern `list(itertools.permutations(x, y))` materializes every permutation in memory at once.

you can write a better version to generate the samples.
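One way to avoid materializing the full permutation list is to draw samples on demand with a generator. A minimal sketch of the idea (the character set and helper name are illustrative, not the repo's actual code):

```python
import itertools
import random
import string

chars = string.ascii_lowercase + string.digits  # 36 characters, as an example

# The problem: list(itertools.permutations(chars, 6)) would build
# 36*35*34*33*32*31 ≈ 1.4 billion tuples in RAM before generating anything.

# A streaming alternative: draw each label on demand, O(1) memory.
def sample_captchas(n, num_per_image=6):
    for _ in range(n):
        # random.sample draws without replacement, matching the
        # no-repeated-characters behavior of permutations
        yield ''.join(random.sample(chars, num_per_image))

for captcha in sample_captchas(3):
    print(captcha)  # each label is num_per_image characters long
```

The trade-off is that random sampling can occasionally repeat a label across images, but the UUID suffix in the filename keeps the files distinct.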

sorry for the delayed reply.

snowfluke commented 2 years ago

I tweaked it a bit:

    import os
    import random
    import uuid

    from captcha.image import ImageCaptcha

    image = ImageCaptcha(width=width, height=height)

    remain_count = max_images_count
    print('generating %s epochs of captchas in %s.' % (n, img_dir))
    choices = get_choices()
    char_sets = ''.join(choices)

    for epoch in range(n):
        for i in range(remain_count):
            if i % 10000 == 0:
                print('(%s / %s)' % (i, remain_count))

            # sample without replacement, so characters never repeat in a label
            captcha = ''.join(random.sample(char_sets, num_per_image))
            fn = os.path.join(img_dir, '%s_%s.png' % (captcha, uuid.uuid4()))
            image.write(captcha, fn)
        if n < 20:
            print('(%s/%s) epochs finished' % (epoch + 1, n))

Now I have another issue: how can we modify the model to handle 4-6 letters in just 1 model?

JackonYang commented 2 years ago

@snowfluke

how can we modify the model to be able to handle 4-6 letters in just 1 model?

the model in this repo does not support this by default.

there are two directions to solve it:

  1. train the model to support 6 letters. for images with fewer than 6 characters, pad the labels with a placeholder character such as `X` to align them, then strip the placeholder characters from the prediction in the inference process.
  2. use an LSTM model instead of a CNN model. there is an example on tensorflow.org: https://www.tensorflow.org/guide/keras/rnn#using_cudnn_kernels_when_available

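The padding idea in option 1 can be sketched as follows (the placeholder character, helper names, and max length are assumptions for illustration):

```python
MAX_LEN = 6
PAD = 'X'  # placeholder character, assumed never to appear in a real captcha

def pad_label(label, max_len=MAX_LEN, pad=PAD):
    """Right-pad a 4-6 character label to a fixed length for training."""
    return label + pad * (max_len - len(label))

def strip_label(prediction, pad=PAD):
    """Remove trailing placeholder characters from a prediction at inference time."""
    return prediction.rstrip(pad)

print(pad_label('ab3f'))      # 'ab3fXX'
print(strip_label('ab3fXX'))  # 'ab3f'
```

With this scheme, every training label has exactly `MAX_LEN` characters, so the fixed-output model architecture stays unchanged.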
snowfluke commented 2 years ago

@JackonYang

I see, I think I'm gonna go with the first choice since I don't want the pre-trained model to go to waste. Thank you for the reference. One more thing: could we do a third option, where we first evaluate how many letters are in the given sample, and then have 3 models (4, 5, and 6 letters) conditionally solve it? I want to know if we can detect how many letters are in a captcha.

JackonYang commented 2 years ago

@snowfluke yes, we can.

you are proposing a 4-model solution in the 3rd option: a letter-count classifier, plus 4-letter, 5-letter, and 6-letter models.

actually, the 1st choice is an optimized version of the 3rd one: the models share the CNN layers, and the fully connected layer combines the heads of the other 4 models.
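The shared-trunk idea could look roughly like this in Keras (the layer sizes, image shape, and 37-class heads counting a placeholder class are all assumptions for illustration, not the repo's actual model):

```python
import tensorflow as tf
from tensorflow.keras import layers

MAX_LEN = 6     # assumed maximum captcha length
N_CLASSES = 37  # e.g. 36 characters + 1 placeholder class (assumption)

inputs = tf.keras.Input(shape=(100, 120, 3))  # assumed image size

# Shared CNN trunk, used by all character positions
x = layers.Conv2D(32, 3, activation='relu')(inputs)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)
x = layers.Dense(128, activation='relu')(x)

# One softmax head per character position; padded positions
# learn to predict the placeholder class.
outputs = [layers.Dense(N_CLASSES, activation='softmax', name='char_%d' % i)(x)
           for i in range(MAX_LEN)]

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```

The point is that one forward pass through the shared trunk serves all positions, instead of running 4 separate models per image.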

snowfluke commented 2 years ago

Alright, thank you very much. I'll go with the 1st choice and use `@` as the placeholder.