Closed snowfluke closed 2 years ago
@snowfluke
I think the memory cost is caused by https://github.com/JackonYang/captcha-tensorflow/blob/master/datasets/gen_captcha.py#L35 and L43, code pattern: list(itertools.permutations(x, y))
you can write a better version to generate the samples.
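To illustrate the point: `list(itertools.permutations(x, y))` builds every permutation in memory before a single image is written, while the bare iterator yields one tuple at a time. Here is a minimal sketch, assuming a 36-character set (digits plus lowercase letters; the repo's actual choices may differ). The helper names `count_labels` and `iter_labels` are hypothetical and not part of the repo.

```python
import itertools
import math
import string

# Assumed character set for illustration only; the repo's real choices may differ.
CHARS = string.digits + string.ascii_lowercase


def count_labels(num_chars, length):
    """Number of tuples that list(itertools.permutations(...)) would hold at once."""
    return math.perm(num_chars, length)  # requires Python 3.8+


def iter_labels(chars, length, limit):
    """Lazily yield at most `limit` labels without materializing the full list."""
    perms = itertools.permutations(chars, length)
    for combo in itertools.islice(perms, limit):
        yield ''.join(combo)


if __name__ == '__main__':
    print(count_labels(len(CHARS), 6))      # size of the fully materialized list
    print(list(iter_labels(CHARS, 6, 3)))   # only three labels are ever created
```

Note that lazy iteration returns labels in lexicographic order, so a capped run only covers a narrow slice of the label space. Drawing random labels per image is another way to keep memory flat.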
sorry for the delayed reply.
I tweaked it a bit:
image = ImageCaptcha(width=width, height=height)
remain_count = max_images_count
print('generating %s epoches of captchas in %s.' % (n, img_dir))
choices = get_choices()
char_sets = ''.join(choices)
for _ in range(n):
    for i in range(remain_count):
        if i % 10000 == 0:
            print('(%s / %s)' % (i, remain_count))
        captcha = ''.join(random.sample(char_sets, num_per_image))
        fn = os.path.join(img_dir, '%s_%s.png' % (captcha, uuid.uuid4()))
        image.write(captcha, fn)
    if n < 20:
        print('(%s/%s) epoches finished' % (_+1, n))
Now I have another question: how can we modify the model so that a single model handles captchas of 4-6 letters?
@snowfluke
how can we modify the model to be able to handle 4-6 letters in just 1 model?
the model in this repo does not support this by default.
there are two directions to solve it:
@JackonYang
I see. I think I'm gonna go with the first choice, since I don't want the pre-trained model to go to waste. Thank you for the reference. One more thing: could we do a third option, which is to first estimate how many letters are in the given sample, and then have 3 models (for 4, 5, and 6 letters) solve the sample conditionally? I want to know if we can detect how many letters are in a captcha.
@snowfluke yes, we can.
your 3rd option is really a 4-model solution: one classifier for the number of letters, plus a 4-letter, a 5-letter, and a 6-letter model.
actually, the 1st choice is an optimized version of the 3rd one: the models share the CNN layers, and the fully connected layer is a combination of those 4 models.
Alright, thank you very much. I'll go with the 1st choice and use @ as the placeholder.
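For reference, the placeholder idea could look roughly like this on the label side. This is a sketch under assumptions, not the repo's code: labels are padded to a fixed length of 6 with `@`, and the padding is stripped from the prediction afterwards. The helper names `pad_label` and `strip_label` are hypothetical. It also assumes `@` never appears in a real captcha and is added to the model's character set as one extra class.

```python
PLACEHOLDER = '@'   # assumed padding symbol; must not appear in real captchas
MAX_LEN = 6         # longest captcha the single model has to handle


def pad_label(text, max_len=MAX_LEN, placeholder=PLACEHOLDER):
    """Right-pad a 4-6 letter label to a fixed length for training."""
    if not 0 < len(text) <= max_len:
        raise ValueError('label length must be between 1 and %d' % max_len)
    if placeholder in text:
        raise ValueError('placeholder must not occur in the label itself')
    return text.ljust(max_len, placeholder)


def strip_label(padded, placeholder=PLACEHOLDER):
    """Drop trailing placeholders from a fixed-length prediction."""
    return padded.rstrip(placeholder)


if __name__ == '__main__':
    for label in ('ab3d', 'ab3de', 'ab3def'):
        padded = pad_label(label)
        print(label, '->', padded, '->', strip_label(padded))
```

With this scheme the image is still rendered from the unpadded text. Only the training target is padded, so the output layer always predicts 6 positions.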
I tried to generate datasets with 6 letters as follows:
But it turns out that the Google Colab process exits automatically because of RAM usage. Is there any way to limit the RAM usage?