A question about inference as described in the InstructPix2Pix paper (https://arxiv.org/pdf/2211.09800.pdf): it treats C_I and C_T as two separate conditions, so each denoising step needs three forward passes and two guidance scales. Your inference code still seems to use only C_T, not C_I, as the condition. Have you tried doing it the way InstructPix2Pix shows?
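For reference, here is a minimal sketch of the two-scale classifier-free guidance from the InstructPix2Pix paper, as I understand it. The function name `ip2p_guidance` and its argument names are made up for illustration, and plain Python lists stand in for the noise-prediction tensors. This is not the IP-Adapter-Instruct implementation, only the combination step from the paper.

```python
from typing import List, Sequence


def ip2p_guidance(
    eps_uncond: Sequence[float],  # e(z_t, null, null): no image, no text condition
    eps_img: Sequence[float],     # e(z_t, C_I, null): image condition only
    eps_full: Sequence[float],    # e(z_t, C_I, C_T): image and text conditions
    s_img: float,                 # image guidance scale (s_I in the paper)
    s_txt: float,                 # text guidance scale (s_T in the paper)
) -> List[float]:
    """Combine three noise predictions using two guidance scales.

    Implements
        e~ = e(0, 0) + s_I * (e(C_I, 0) - e(0, 0)) + s_T * (e(C_I, C_T) - e(C_I, 0))
    element-wise over the inputs.
    """
    return [
        u + s_img * (i - u) + s_txt * (f - i)
        for u, i, f in zip(eps_uncond, eps_img, eps_full)
    ]


# Toy values standing in for the outputs of the three forward passes.
uncond = [0.0, 1.0]
img_only = [1.0, 1.0]
full = [2.0, 3.0]

guided = ip2p_guidance(uncond, img_only, full, s_img=1.5, s_txt=7.5)
```

With both scales set to 1.0 the result reduces to the fully conditioned prediction, which is a quick sanity check that the three passes are wired in the right order.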
Hey! Sorry for the late reply. Funny you mention this: our IP-Adapter-Instruct actually uses a novel modification of CFG that takes 3 conditions. It should be described in more detail in the camera-ready paper.