Question about the Function transform

MaartenGr / Concept

Concept Modeling: Topic Modeling on Images and Text

https://maartengr.github.io/Concept/

MIT License

Question about the Function transform #7

Closed xinli2008 closed 2 years ago

xinli2008 commented 2 years ago

Thank you for your excellent work :) I have a question about the transform function. According to the docstring, given images and image_embeddings, it returns "Predictions: Concept predictions for each image". But when I read the code of transform, the output does not seem to be the concept prediction for each image. Can you explain it? Thank you very much!

MaartenGr commented 2 years ago

The .transform function returns the predictions for each image. Take the last few lines of the .transform method as shown below:

https://github.com/MaartenGr/Concept/blob/d270607d6ea4d789a42d54880ab4a0c977bb69ce/concept/_model.py#L193-L195

With that, we reduce the dimensionality of the embeddings and feed the reduced embeddings to the HDBSCAN model to cluster. The resulting cluster assignments, predictions, are the concept predictions for each image.
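As a rough illustration of that flow, here is a toy sketch in plain Python. It is not the code from _model.py: reduce_dims and predict_clusters are hypothetical stand-ins for the UMAP and HDBSCAN steps (simple truncation and nearest-centroid assignment are assumptions made only to keep the example self-contained), and the centroids and embeddings are made up. The point is only the shape of the result: one integer label per image, with -1 marking an image that fits no cluster, which is how HDBSCAN labels outliers.

```python
import math


def reduce_dims(embeddings, n_components=2):
    """Hypothetical stand-in for the UMAP step: keep the first n_components values."""
    return [vector[:n_components] for vector in embeddings]


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_clusters(points, centroids, max_distance=1.0):
    """Hypothetical stand-in for the HDBSCAN step: nearest centroid, or -1 if too far."""
    labels = []
    for point in points:
        distances = [euclidean(point, centroid) for centroid in centroids]
        nearest = distances.index(min(distances))
        labels.append(nearest if distances[nearest] <= max_distance else -1)
    return labels


def transform(image_embeddings, centroids):
    """Reduce the embeddings, then return one cluster label per image."""
    reduced = reduce_dims(image_embeddings)
    return predict_clusters(reduced, centroids)


# Made-up data: two known clusters and three "image embeddings".
centroids = [[0.0, 0.0], [5.0, 5.0]]
image_embeddings = [[0.1, 0.2, 9.0], [5.1, 4.9, 3.0], [20.0, 20.0, 1.0]]
predictions = transform(image_embeddings, centroids)
print(predictions)
```

The list has the same length as the input, so predictions[i] is the concept assigned to the i-th image.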

xinli2008 commented 2 years ago

Sorry to bother you again. I tried the following code to find the best concept for each image:

concept_model = ConceptModel()
new_concepts = concept_model.transform(image_list)

The error is:

Traceback (most recent call last):
  File "_model.py", line 629, in <module>
    new_concepts = concept_model.transform(image_list)
  File "_model.py", line 197, in transform
    umap_embeddings = self.umap_model.transform(image_embeddings)
  File "/home/lixin/enter/envs/PR-VIST/lib/python3.7/site-packages/umap/umap_.py", line 2802, in transform
    if self._raw_data.shape[0] == 1:
AttributeError: 'UMAP' object has no attribute '_raw_data'

Can you give me some advice on how to fix it? Thank you @MaartenGr

MaartenGr commented 2 years ago

Definitely not a bother! Although I am not familiar with the error, I would advise making sure you have the newest version of umap-learn installed. If that does not work out, creating a completely fresh environment and re-installing in theory should resolve your issue.
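One more thing that may be worth ruling out: the snippet above calls .transform on a freshly constructed ConceptModel without fitting it first. As far as I can tell, umap-learn only stores _raw_data when the model is fitted, so an unfitted UMAP would raise exactly this kind of AttributeError. The sketch below uses a hypothetical ToyReducer class (an assumption for illustration, not umap-learn's code) to show the pattern.

```python
class ToyReducer:
    """Hypothetical fit/transform reducer; a stand-in, not umap-learn's implementation."""

    def fit(self, rows):
        # The training data is only stored once fit() has been called.
        self._raw_data = [list(row) for row in rows]
        return self

    def transform(self, rows):
        # Accessing self._raw_data before fit() raises AttributeError.
        n_rows = len(self._raw_data)
        n_dims = len(self._raw_data[0])
        means = [sum(row[d] for row in self._raw_data) / n_rows for d in range(n_dims)]
        # Center the new rows on the mean of the training data.
        return [[value - mean for value, mean in zip(row, means)] for row in rows]


def try_transform(reducer, rows):
    """Return the transformed rows, or the error text if the reducer is unfitted."""
    try:
        return reducer.transform(rows)
    except AttributeError as err:
        return f"AttributeError: {err}"


train = [[0.0, 2.0], [2.0, 4.0]]

# transform() on an unfitted object fails on the missing attribute.
unfitted = try_transform(ToyReducer(), train)
print(unfitted)

# Fitting first makes the same call succeed.
fitted = try_transform(ToyReducer().fit(train), [[1.0, 3.0]])
print(fitted)
```

If that is the cause here, fitting the model on your training images first (I believe the README uses fit_transform for this) and only then calling .transform on new images should avoid the error.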

xinli2008 commented 2 years ago

Thank you for the advice, I will follow those instructions. I have another question, and I would appreciate it if you could share any ideas. I have a sequence of images (perhaps 10 images). If I want to find their topic or theme (wedding, vacation, etc.), do you have any suggestions?

MaartenGr commented 2 years ago

To find the topics of a set of images, I would advise going through the README and simply replacing those images with the images that you have. Do note that you would want at least a few hundred images to get a good clustering going.

xinli2008 commented 2 years ago

Thank you for your advice, I have some ideas in mind now. I have another question: if I predict the topic of a set of images, how can I evaluate the results? I could not find any image-to-topic dataset. Looking forward to your reply.

MaartenGr commented 2 years ago

To my knowledge, there is currently no dataset that contains both images and topics, as topic modeling is typically evaluated through coherence, which cannot easily be generalized to images. Since concept modeling is rather new, I do not think there is a set of standard procedures yet for evaluating concepts.