Project-MONAI / GenerativeModels

MONAI Generative Models makes it easy to train, evaluate, and deploy generative models and related applications

I have a question about the prompt when generating a chest image. #456

Closed dlsgurdlfkd closed 6 months ago

dlsgurdlfkd commented 9 months ago

Thanks for the great research!!

According to the paper "Brain imaging generation with latent diffusion models," the diffusion model receives conditioning variables (age, sex, ventricular volume, brain volume) when it generates brain images.

On the other hand, when generating a chest image, how do you convert prompts into conditioning variables?

Looking at inference.json, it seems to use a pre-trained CLIPTokenizer and CLIPTextModel.

When generating a chest image, if I enter a prompt such as "Big right-sided pleural effusion", will the pre-trained CLIPTokenizer and CLIPTextModel convert the prompt into conditioning variables?
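
For reference, here is a minimal sketch of how I understand the text encoding step. The checkpoint name, padding settings, and shapes below are my own assumptions, not necessarily what the bundle's inference.json actually uses:

```python
import torch
from transformers import CLIPTokenizer, CLIPTextModel

# Assumed checkpoint; the bundle may point at a different pre-trained CLIP model.
checkpoint = "openai/clip-vit-large-patch14"
tokenizer = CLIPTokenizer.from_pretrained(checkpoint)
text_encoder = CLIPTextModel.from_pretrained(checkpoint)

prompt = "Big right-sided pleural effusion"
tokens = tokenizer(
    prompt,
    padding="max_length",
    max_length=tokenizer.model_max_length,
    truncation=True,
    return_tensors="pt",
)

with torch.no_grad():
    # last_hidden_state has shape (batch, sequence_length, hidden_size),
    # e.g. (1, 77, 768) for this encoder.
    prompt_embedding = text_encoder(input_ids=tokens.input_ids).last_hidden_state

print(prompt_embedding.shape)
```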

Is it correct that conditioning variables are used? If so, could you tell me which conditioning variables are used when generating chest images?

Thank you.

marksgraham commented 9 months ago

Hi there,

The tokenizer and text encoder will not convert the prompt into human-interpretable conditioning variables (e.g. age, volume) - just into an embedding space that is meaningful to the model itself, but not to you! So the only way to control the generation in a human-interpretable way is through the text prompt.
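
To make that concrete, here is a rough sketch (the channel/level settings are illustrative, not the bundle's exact configuration) of how a text embedding is consumed: it is passed as cross-attention `context` to the diffusion UNet, so the model only ever sees a sequence of embedding vectors, never named variables such as age or volume:

```python
import torch
from generative.networks.nets import DiffusionModelUNet

unet = DiffusionModelUNet(
    spatial_dims=2,
    in_channels=4,            # latent channels from the autoencoder (assumed)
    out_channels=4,
    num_res_blocks=2,
    num_channels=(128, 256, 512),
    attention_levels=(False, True, True),
    num_head_channels=64,
    with_conditioning=True,
    cross_attention_dim=768,  # must match the text encoder's hidden size
)

# Stand-ins for a latent image, a sampled timestep, and the CLIP text embedding
# produced in the encoding sketch above.
latents = torch.randn(1, 4, 64, 64)
timesteps = torch.randint(0, 1000, (1,))
prompt_embedding = torch.randn(1, 77, 768)

# The prompt only enters the model here, as cross-attention context.
noise_pred = unet(latents, timesteps=timesteps, context=prompt_embedding)
```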