xinyu1205 / recognize-anything

Open-source and strong foundation image recognition models.
https://recognize-anything.github.io/
Apache License 2.0
2.78k stars 271 forks

how to form a ram_plus_tag_embedding_class_4585_des_51.pth for my own data. #188

Open DWHNicholas opened 3 months ago

DWHNicholas commented 3 months ago

In the fourth step of model training preparation, you need to download a ram_plus_tag_embedding_class_4585_des_51.pth file. Looking at the code, this file seems to be used for tag embedding. When I loaded it in the code, its shape was ([233835, 512]). I would like to know how this file is generated, and how to generate an equivalent file when preparing my own dataset.

DWHNicholas commented 3 months ago

I now see that, going by ram_tag_list_4585_llm_tag_descriptions.json, 233835 = 4585 × 51: 4585 tags with 51 descriptions each, so each row should be the embedding of one description of a tag. Is there any other document that describes how to generate this embedding, or which model is used to generate it?
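For anyone landing here with the same question, the row count above can be reproduced with a small sketch. It is not the repository's actual procedure. It assumes the descriptions JSON is a list of single-key objects mapping a tag to its list of descriptions, that rows are stored in tag order, and that each row is L2-normalized; none of that is confirmed in this thread. `encode_text` is a hypothetical callable standing in for whatever text encoder produced the 512-wide vectors (that width is consistent with several CLIP-style text encoders, but the thread does not say which model was used), and `toy_encode_text` is only a placeholder so the sketch runs.

```python
import json
import math

EMBED_DIM = 512            # row width reported in the thread
DESCRIPTIONS_PER_TAG = 51  # "des_51" in the file name


def flatten_descriptions(tag_entries, per_tag=DESCRIPTIONS_PER_TAG):
    """Return (tags, descriptions) with descriptions kept in tag order.

    Assumed layout (not confirmed by the thread):
    [{"tag name": ["description 1", "description 2", ...]}, ...]
    """
    tags, descriptions = [], []
    for entry in tag_entries:
        # Each entry is expected to hold exactly one tag.
        (tag, texts), = entry.items()
        if len(texts) != per_tag:
            raise ValueError(f"{tag!r} has {len(texts)} descriptions, expected {per_tag}")
        tags.append(tag)
        descriptions.extend(texts)
    return tags, descriptions


def l2_normalize(vec):
    """Scale a vector to unit length (assumed, common for CLIP-style embeddings)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def build_embedding_rows(tag_entries, encode_text,
                         per_tag=DESCRIPTIONS_PER_TAG, dim=EMBED_DIM):
    """Encode every description; the result has len(tags) * per_tag rows."""
    tags, descriptions = flatten_descriptions(tag_entries, per_tag)
    rows = []
    for text in descriptions:
        vec = encode_text(text)  # hypothetical: your real text encoder goes here
        if len(vec) != dim:
            raise ValueError(f"encoder returned {len(vec)} values, expected {dim}")
        rows.append(l2_normalize(vec))
    return tags, rows


def toy_encode_text(text, dim=EMBED_DIM):
    """Placeholder encoder, NOT a real model: a deterministic dummy vector."""
    return [float((len(text) + i) % 7 + 1) for i in range(dim)]


# Tiny demo with 2 tags and 2 descriptions each, instead of 4585 and 51.
demo_json = '[{"cat": ["a photo of a cat", "a small furry pet"]}, {"dog": ["a photo of a dog", "a loyal pet"]}]'
demo_tags, demo_rows = build_embedding_rows(json.loads(demo_json), toy_encode_text, per_tag=2)
print(len(demo_rows), len(demo_rows[0]))  # rows = tags * per_tag, width = EMBED_DIM
```

With a real encoder and 4585 tags of 51 descriptions each, this yields 4585 × 51 = 233835 rows of width 512, matching the shape reported above. The rows could then be turned into a tensor with `torch.tensor(rows)` and written out with `torch.save`; whether the training code expects exactly that layout should be checked against the loader in the repository.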

DWHNicholas commented 3 months ago

I think I already know how to do it

ntlm1686 commented 1 month ago

@DWHNicholas Could you please explain briefly how?