Some questions about fine-tuning the recognize-anything model

xinyu1205 / recognize-anything

Open-source and strong foundation image recognition models.

https://recognize-anything.github.io/

Apache License 2.0

2.78k stars, 271 forks

Open weijiafs opened 5 months ago

weijiafs commented 5 months ago

Hello

I want to fine-tune the recognize-anything model to label images with tags for real people or cartoon characters. I have two questions:

1. Would fine-tuning just RAM++ be enough, or do I also need to work on the text2tag part?
2. I'm not sure how to carry out the following step. Could you please explain it in detail?

Prepare pretrained Swin-Transformer, and set 'ckpt' in ram/configs/swin.

Thanks.

adbmdp commented 5 months ago

I don't think you need the "Prepare pretrained Swin-Transformer" step. You just need to fine-tune the model. There is no need for steps 1 to 5.

I'm also trying to train the model. It is not an easy task!
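
For anyone who does need the "set 'ckpt' in ram/configs/swin" step, here is a minimal sketch of what it could look like. It assumes the files under ram/configs/swin are JSON and contain a top-level 'ckpt' key holding the path to the Swin-Transformer weights. The helper name `set_ckpt` and the example paths are hypothetical, not part of the repository, so check the actual config files before relying on this.

```python
import json
from pathlib import Path


def set_ckpt(config_path, ckpt_path):
    """Set the 'ckpt' entry of a JSON config file to a local checkpoint path.

    Assumption: the config is a JSON object with a top-level 'ckpt' key.
    All other keys are left unchanged.
    """
    config_path = Path(config_path)
    config = json.loads(config_path.read_text())
    config["ckpt"] = str(ckpt_path)
    config_path.write_text(json.dumps(config, indent=4) + "\n")
    return config


# Hypothetical usage; both paths are placeholders:
# set_ckpt("ram/configs/swin/<config file>.json", "/path/to/swin_checkpoint.pth")
```

Editing the file by hand in a text editor does the same thing. The script is only useful if you switch checkpoints often.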