JingyuanYY / EmoGen

This is the official implementation of 2024 CVPR paper "EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models".

Emotion transfer Metrics #9

Open hustzyj opened 2 months ago

hustzyj commented 2 months ago

Hi, I would like to know how the metrics for the emotion transfer task in this paper are calculated, especially CLIP-img and CLIP-txt. I could not find the calculation method in the related literature.

fengjw0909 commented 3 weeks ago

Apologies for the oversight; we did not provide a detailed description in the text. CLIP-txt encodes the input emotionless prompt (e.g., "skirt") with the CLIP text encoder and the generated emotional image with the CLIP image encoder, then computes the cosine similarity between the two features. CLIP-img, on the other hand, computes the cosine similarity between the image features of two images generated by the model: one from the emotionless prompt ("skirt") and one from the emotional prompt (" skirt").
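
For anyone trying to reproduce this, here is a rough sketch of the two definitions above. It is not the repository's evaluation code. The names `cosine_similarity`, `clip_txt`, `clip_img`, `encode_text`, and `encode_image` are hypothetical, and the encoders are passed in as plain callables that return feature vectors, so any CLIP wrapper could be plugged in. The CLIP variant and any averaging over samples are not specified in this thread and are left out.

```python
import math
from typing import Any, Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def clip_txt(
    encode_text: Callable[[str], Vector],   # hypothetical CLIP text encoder wrapper
    encode_image: Callable[[Any], Vector],  # hypothetical CLIP image encoder wrapper
    emotionless_prompt: str,
    emotional_image: Any,
) -> float:
    """Similarity between the emotionless prompt and the generated emotional image."""
    return cosine_similarity(
        encode_text(emotionless_prompt), encode_image(emotional_image)
    )


def clip_img(
    encode_image: Callable[[Any], Vector],  # hypothetical CLIP image encoder wrapper
    emotionless_image: Any,
    emotional_image: Any,
) -> float:
    """Similarity between images generated from the emotionless and emotional prompts."""
    return cosine_similarity(
        encode_image(emotionless_image), encode_image(emotional_image)
    )
```

In practice `encode_text` and `encode_image` would wrap a pretrained CLIP model and return its text and image embeddings as flat lists of floats. Since cosine similarity normalizes both vectors, the features do not need to be normalized beforehand.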