mahmoodlab / CONCH

A vision-language foundation model for computational pathology - Nature Medicine

Implementation of Image-to-Text (Captioning) #6

Open bryanwong17 opened 2 months ago

bryanwong17 commented 2 months ago

Hi, I was wondering whether CONCH can directly generate text from an image. From the code, it seems that CONCH currently supports only image-to-text retrieval: given an image and several candidate texts, it finds the text most similar to the image. However, the paper also shows an example of CONCH doing captioning, with a comparison between predicted and corrected captions. If captioning is supported, could you please provide the code for it? Thanks!
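To make the distinction concrete, here is a minimal sketch of what image-to-text retrieval amounts to, as I understand it. It is not CONCH's code: the embeddings are toy placeholder vectors, and `cosine_similarity` and `rank_texts` are hypothetical helper names introduced only for illustration. In practice the vectors would come from the model's image and text encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_texts(image_embedding, text_embeddings):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [cosine_similarity(image_embedding, t) for t in text_embeddings]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Toy placeholder data (assumption): real embeddings would be produced
# by an image encoder and a text encoder sharing one embedding space.
candidate_texts = ["caption A", "caption B", "caption C"]
text_embeddings = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]
image_embedding = [0.9, 0.1, 0.0]

ranking = rank_texts(image_embedding, text_embeddings)
best_index, best_score = ranking[0]
print("Best match:", candidate_texts[best_index])
```

Retrieval can only pick among texts that were supplied in advance. Captioning would instead need a decoder that generates new text token by token, which is the part I could not find in the released code.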

Weiqin-Zhao commented 1 week ago

I am also looking for this feature of this excellent work. I hope the authors can release the corresponding code and weights in the future.