thiner closed this issue 2 months ago
@yhcao6 Thanks for your answer. I'd like to summarize what I learned from studying the code; please correct me if I misunderstood the logic.
`<ImageHere>` is a fixed placeholder that separates the image from the text prompt. The image is either opened with the `PIL.Image.open` method or passed as a `torch.Tensor` instance.

Based on the summary above, I have a further question: does XComposer-VL support multiple images as input? I think it's not supported currently, is it?
XComposer-VL supports multiple images as input, e.g., query = '<ImageHere> <ImageHere> balabala', img_path = ['a.jpg', 'b.jpg']
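
To make the pairing in that example concrete, here is a minimal sketch of how each `<ImageHere>` placeholder could be matched, in order, with an entry of `img_path`. The helpers `split_query` and `interleave` are hypothetical names used only for illustration; they are not part of XComposer-VL, and the model's real preprocessing may differ.

```python
from typing import List, Tuple

PLACEHOLDER = "<ImageHere>"  # placeholder string quoted in this thread


def split_query(query: str, img_paths: List[str]) -> List[str]:
    """Split the query at each placeholder and check the count matches the images.

    Hypothetical helper for illustration; not part of XComposer-VL.
    """
    segments = query.split(PLACEHOLDER)
    n_placeholders = len(segments) - 1
    if n_placeholders != len(img_paths):
        raise ValueError(
            f"{n_placeholders} placeholder(s) but {len(img_paths)} image path(s)"
        )
    return segments


def interleave(query: str, img_paths: List[str]) -> List[Tuple[str, str]]:
    """Return ("image", path) and ("text", string) parts in prompt order."""
    segments = split_query(query, img_paths)
    parts: List[Tuple[str, str]] = []
    for i, segment in enumerate(segments):
        text = segment.strip()
        if text:  # skip empty or whitespace-only text between placeholders
            parts.append(("text", text))
        if i < len(img_paths):  # one image follows every segment except the last
            parts.append(("image", img_paths[i]))
    return parts


if __name__ == "__main__":
    print(interleave("<ImageHere> <ImageHere> balabala", ["a.jpg", "b.jpg"]))
```

Under these assumptions, the first placeholder maps to `a.jpg` and the second to `b.jpg`, and a query whose placeholder count differs from the number of image paths is rejected early.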
Kindly reopen this issue if you have any further questions.
Is `<ImageHere>` a fixed placeholder in the text prompt?