jiuntian / interactdiffusion

[CVPR 2024] Official repo for "InteractDiffusion: Interaction-Control for Text-to-Image Diffusion Model".
https://jiuntian.github.io/interactdiffusion/

Custom Dataset #11

Open Hayeon-kimm opened 3 months ago

Hayeon-kimm commented 3 months ago

Thank you for sharing your research and code; I am learning a lot from it. I want to run an experiment on a custom dataset. Looking at the HICO_DET_CLIP data you shared, I think I also need the 'action' image_embedding / text_embedding. For a custom dataset, can I take the action bbox, crop it, and pass the crop through CLIP to get an embedding that corresponds to the action image embedding you provided?

jiuntian commented 3 months ago

Yes, I think you can obtain the CLIP embeddings that way. Note that InteractDiffusion uses only text_embedding.
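For readers following along, here is a rough sketch of the crop-and-embed idea discussed above. It is not the repository's preprocessing code. The helper names (`clamp_box`, `load_clip`, `embed_action`), the checkpoint name, and the choice of projected CLIP features are all assumptions made for illustration; the stored HICO_DET_CLIP embeddings may have been produced differently (for example from a different layer or with extra normalization). It assumes Pillow, PyTorch, and Hugging Face `transformers` are installed, and that the action bbox is already known in pixel coordinates.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def clamp_box(box: Box, width: int, height: int) -> Tuple[int, int, int, int]:
    """Expand a box to whole pixels and clip it to the image bounds."""
    x0, y0, x1, y1 = box
    left = max(0, math.floor(min(x0, x1)))
    upper = max(0, math.floor(min(y0, y1)))
    right = min(width, math.ceil(max(x0, x1)))
    lower = min(height, math.ceil(max(y0, y1)))
    if right <= left or lower <= upper:
        raise ValueError(f"box {box} does not overlap a {width}x{height} image")
    return left, upper, right, lower


def load_clip(name: str = "openai/clip-vit-large-patch14"):
    """Load a CLIP model and processor. The checkpoint name is an assumption."""
    from transformers import CLIPModel, CLIPProcessor  # imported lazily

    return CLIPModel.from_pretrained(name).eval(), CLIPProcessor.from_pretrained(name)


def embed_action(image, box: Box, action_text: str, model, processor):
    """Return (text_features, image_features) for one action.

    `image` is a PIL.Image. Only the text side is used by InteractDiffusion
    according to the maintainer; the image side is shown for completeness.
    """
    import torch  # imported lazily

    # PIL's crop takes a (left, upper, right, lower) pixel tuple.
    crop = image.crop(clamp_box(box, image.width, image.height))
    with torch.no_grad():
        text_inputs = processor(text=[action_text], return_tensors="pt", padding=True)
        image_inputs = processor(images=crop, return_tensors="pt")
        # Projected CLIP features; whether the released data uses these exact
        # features is not confirmed in this thread.
        text_features = model.get_text_features(**text_inputs)
        image_features = model.get_image_features(**image_inputs)
    return text_features, image_features
```

A call might look like `model, processor = load_clip()` followed by `embed_action(Image.open(path).convert("RGB"), action_box, "riding", model, processor)`, where `path`, `action_box`, and the label are placeholders for your own data.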

Hayeon-kimm commented 3 months ago

If it is okay, could you share your process_grounding.py code for HICO-DET? I wrote a loader to get embeddings for my custom data, but your hico-det-clip and gligen.tsv are a little different, so I would like to see your preprocessing steps for HICO-DET.

jiuntian commented 3 months ago

You may refer to extract_embedding.py.zip.