CaptchaAgent / hcaptcha-model-factory

🏗 hCaptcha image label binary model factory (PyTorch Training, Cluster-based Auto Label Tools, Export ONNX model, ONNX model inference)
GNU General Public License v3.0

How do I help get new captchas solved quicker #53

Open Littlesnaegg opened 1 year ago

Littlesnaegg commented 1 year ago

Hello, I'm using your program and it works great!

I want to help you update your program more frequently. I already installed the hcaptcha-model-factory, but when I want to train it, so I can upload the files for you, it needs more pictures. Can you give me a step-by-step guide on how you update your program? A video would be nice, so that I can help you get updates out quicker!

Thank you!

beiyuouo commented 1 year ago

Sorry, I can't understand what you mean, especially the sentence "I already installed the hcaptcha-model-factory, but when I want to train it, so I can upload the files for you, it needs more pictures."

Littlesnaegg commented 1 year ago

Well, I would like to train your program to solve the new captchas, but I'm having trouble because the program seems to want more pictures. Could you make a tutorial, as a video or as text, on how you would go about adding a new captcha for the program to solve? I use hcaptcha-challenger and would like to contribute to its existing assets, so that the new captchas can also be solved.

beiyuouo commented 1 year ago

You can refer to the tutorials in the wiki. For the issue of training with few images, we will try using the few-shot learning method, but this is still WIP.

QIN2DIM commented 1 year ago

collect datasets

https://github.com/captcha-challenger/hcaptcha-whistleblower

Littlesnaegg commented 1 year ago

I'm not quite sure how to use hcaptcha-whistleblower, or what it is even used for. As for the wiki, I've done all the steps, but when I'm asked to provide the pictures I don't know where to get them from.

So basically this is what I do:

  1. Run main.py
  2. Enter the label name
  3. A folder gets created

---now I'm not sure what to do---

  4. Provide pictures
  5. Run train

If you could provide me with a way to do steps 4 and 5, I'd be grateful!

DonJ0n commented 1 year ago

Your best bet would be what I have been doing since this morning: run the captcha challenger (https://github.com/QIN2DIM/hcaptcha-challenger) with the screenshot param, and you will end up with a dataset in the data folder. Grab the images from there, since they are the actual images the challenge uses, and train your models on those. I noticed there were no more than a few items, and if you manage to train a few models you will be able to bypass it fully automatically.

EDIT: I'm facing the same issue as you :3
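
The collection step described above could be sketched roughly as follows. This is only an illustration: `collect_images` is a hypothetical helper, not part of hcaptcha-challenger or hcaptcha-model-factory, and both folder paths are placeholders you would replace with wherever the screenshots actually land and wherever your label folder lives. It copies every image found under a source folder into a target folder and skips byte-identical duplicates.

```python
import hashlib
import shutil
from pathlib import Path

# File extensions treated as images (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def collect_images(src_dir, dst_dir):
    """Copy unique images found anywhere under src_dir into dst_dir.

    Hypothetical helper: files are deduplicated by SHA-256 of their
    contents and renamed after the first 16 hex digits of that hash.
    Returns the list of target paths, one per unique image.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    seen = set()
    collected = []
    # Materialise the listing first so freshly copied files are never re-scanned.
    for path in sorted(src.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            continue  # same image captured more than once
        seen.add(digest)
        target = dst / f"{digest[:16]}{path.suffix.lower()}"
        if not target.exists():
            shutil.copy2(path, target)
        collected.append(target)
    return collected
```

Calling `collect_images("path/to/screenshots", "path/to/label_folder")` would gather the images in one place; they still have to be sorted by hand into whatever layout the training script expects.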

FileNotFoundError: The structure of the dataset is incomplete | dir=C:\Users\LB\Desktop\Dis\hcaptcha-model-factory-main\data\chair
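
An error like this suggests the label folder exists but lacks the subfolders the training step looks for. Below is a minimal sketch for checking that before running train. The subfolder names `yes` and `bad` are an assumption about the expected layout and are not confirmed anywhere in this thread, so check the wiki and pass different names if needed. `missing_subfolders` is a hypothetical helper, not part of hcaptcha-model-factory.

```python
from pathlib import Path


def missing_subfolders(data_root, label, required=("yes", "bad")):
    """Return the required subfolders that are absent or empty.

    The default names in `required` are an assumption about the layout
    the training script expects; they are not taken from the project.
    """
    label_dir = Path(data_root) / label
    problems = []
    for name in required:
        sub = label_dir / name
        # A subfolder counts as a problem if it is missing or has no entries.
        if not sub.is_dir() or not any(sub.iterdir()):
            problems.append(name)
    return problems
```

For the case above, `missing_subfolders("data", "chair")` would list which of the assumed subfolders still need images; an empty list means the folder at least matches that assumed layout.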