jianjieluo / SCD-Net

[CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades the conventional diffusion model with an additional semantic prior.
https://arxiv.org/abs/2212.03099

How to use custom dataset? #11

Closed: mbrz97 closed this issue 4 months ago

mbrz97 commented 5 months ago

Hi!

This is not really an issue, but I have a question about training this model on a custom dataset for a medical image captioning task, specifically on radiograph images. From what I understand, I need to extract image features from my dataset using the bottom-up-attention model. Can you please confirm whether this is correct?

Additionally, is there an easier way to train the SCD-Net model on a custom dataset? Any assistance you can provide would be greatly appreciated.

Best regards,
A desperate student