Hi,
Thanks for your effort in maintaining this HOI learning list, which is very helpful for keeping us up to date on recent HOI research. We recently introduced an HOI generation work; please consider whether it is appropriate to add to the list.
InteractDiffusion: Interaction-Control for Text-to-Image Diffusion Model (CVPR 2024)
homepage | paper | code
In this work, we explore the reverse task of HOI detection, i.e., HOI generation. We propose a pluggable interaction control model called InteractDiffusion, which extends existing pre-trained T2I diffusion models so that they can be better conditioned on interactions. Specifically, we tokenize the HOI information and learn the relationships among the tokens via interaction embeddings. A conditioning self-attention layer is trained to map HOI tokens to visual tokens, thereby conditioning the visual tokens better in existing T2I diffusion models. Our model gains the ability to control interaction and location in existing T2I diffusion models, and it outperforms existing baselines by a large margin in HOI detection score as well as in fidelity, as measured by FID and KID.
Thank you again.