xinyu1205 / recognize-anything

Open-source and strong foundation image recognition models.
https://recognize-anything.github.io/
Apache License 2.0
2.78k stars, 271 forks

Some questions about fine-tuning recognize-anything model #174

Open weijiafs opened 5 months ago

weijiafs commented 5 months ago

Hello

I want to fine-tune the recognize-anything model to label images with tags for real people or cartoon characters. I have two questions:

  1. Would fine-tuning just RAM++ be enough, or do I also need to work on the text2tag part?

  2. Also, I'm not sure how to carry out the following step. Could you please explain it in detail?

    Prepare pretrained Swin-Transformer, and set 'ckpt' in ram/configs/swin.

Thanks.

adbmdp commented 5 months ago

You can find some answers here: https://github.com/xinyu1205/recognize-anything/issues/173

I don't think you need the "Prepare pretrained Swin-Transformer" step. You only need to fine-tune the model, so steps 1 to 5 can be skipped.
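For reference, if you ever do need that step (for example, when pretraining from scratch), it seems to come down to pointing the 'ckpt' field of a config file under ram/configs/swin at the downloaded Swin weights. Here is a minimal sketch, assuming the config is a JSON file with a 'ckpt' key. The helper name `set_swin_ckpt` and the file names in the usage comment are hypothetical placeholders, not names taken from the repo.

```python
import json
from pathlib import Path


def set_swin_ckpt(config_path, ckpt_path):
    """Write ckpt_path into the 'ckpt' field of a JSON config file.

    Assumption: the Swin config is plain JSON and uses a 'ckpt' key
    for the path to the pretrained weights.
    """
    config_path = Path(config_path)
    # Load the existing config so all other fields are preserved.
    config = json.loads(config_path.read_text())
    config["ckpt"] = str(ckpt_path)
    config_path.write_text(json.dumps(config, indent=4))
    return config


# Hypothetical usage (both paths are placeholders):
# set_swin_ckpt("ram/configs/swin/<your_config>.json",
#               "pretrain_model/<your_swin_weights>.pth")
```

Editing the file by hand works just as well; the point is only that 'ckpt' has to point at a weights file that exists on your machine.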

I'm also trying to train the model. It is not an easy task!