Hi,
1) Is it possible to train the IPAdapter model without text? The training tutorial code expects each data entry to be of the form {image_file, text}. But given that the IPAdapter is a visual adapter, why does training require text annotations, when the base SD model is already trained on <image, text> pairs?
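For example, would a dataset along these lines be a valid workaround (a minimal sketch of what I have in mind; the {image_file, text} field names follow the tutorial's format, and using an empty caption in place of real text is my own assumption, not something confirmed by the repo)?

```python
# Hypothetical sketch: build a training manifest where every "text" field is
# an empty string, so the adapter would be conditioned on images only.
# The {image_file, text} record shape matches the training tutorial's format;
# the empty-caption trick itself is an assumption on my part.
import json


def make_text_free_entries(image_files):
    """Wrap image paths in {image_file, text} records with empty captions."""
    return [{"image_file": f, "text": ""} for f in image_files]


entries = make_text_free_entries(["face_0001.png", "face_0002.png"])
print(json.dumps(entries, indent=2))
```

Would the training loop tolerate empty captions like this, or does it assume non-trivial text embeddings somewhere?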
2) I am currently using your pretrained IPAdapter (not the face model, just the base IP adapter) in inference mode for pose transfer. I provide an upper-body image of the person as input; the resulting face resembles the input person but is not an exact replica. Would training the IPAdapter on a human face dataset such as FFHQ or CelebFaces (without any text annotation) help reconstruct the source person more faithfully?
Thank you.