chflame163 / ComfyUI_LayerStyle

A set of nodes for ComfyUI that composite layers and masks to achieve Photoshop-like functionality.
MIT License

Bug of VQAPrompt node #290

Open pipi32167 opened 1 month ago

pipi32167 commented 1 month ago

I tried to use the VQAPrompt node to generate a prompt, but got this error:

CLIPTextEncode
'list' object has no attribute 'replace'

My workflow:

[image: workflow screenshot]

I reviewed the code and found that the node returns a list of strings, presumably to support multiple image inputs, but the list is never joined into a single string before it is returned:


    def vqa_prompt(self, image, vqa_model, question):
        answers = []
        [vqa_processor, vqa_model, device, precision, model_name] = vqa_model

        for img in image:
            _img = tensor2pil(img).convert("RGB")
            final_answer = question
            matches = re.findall(r'\{([^}]*)\}', question)

            for match in matches:
                if precision == 'fp16':
                    inputs = vqa_processor(_img, match, return_tensors="pt").to(device, torch.float16)
                else:
                    inputs = vqa_processor(_img, match, return_tensors="pt").to(device)
                out = vqa_model.generate(**inputs)
                match_answer = vqa_processor.decode(out[0], skip_special_tokens=True)
                log(f'{self.NODE_NAME} Q:"{match}", A:"{match_answer}"')
                final_answer = final_answer.replace("{" + match + "}", match_answer)
            answers.append(final_answer)

        log(f"{self.NODE_NAME} Processed.", message_type='finish')
        return (answers,)
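
The failure can be reproduced without ComfyUI. This is only a minimal sketch: `encode_text` is a hypothetical stand-in for whatever downstream step calls `.replace` on the prompt, not actual ComfyUI code, and the sample answers are made up for illustration.

```python
# Hypothetical stand-in for a downstream text node that expects a plain string.
def encode_text(text):
    # Any str-only method triggers the same failure when given a list.
    return text.replace("\r\n", "\n")


# Illustrative output of a node that returns one answer per input image.
answers = ["a cat on a sofa", "a dog in a park"]

error_message = None
try:
    encode_text(answers)
except AttributeError as err:
    error_message = str(err)
    print(error_message)
```

Passing a single string instead of the list avoids the exception, which is what the fix below does.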

My fix was simple:

    def vqa_prompt(self, image, vqa_model, question):
        answers = []
        # ...as before
        answers = '\n'.join(answers)
        return (answers,)

If this fix is fine, I can submit a PR.
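
To show what the joined output looks like for one image versus several, here is a small sketch. `join_answers` is a hypothetical helper written for illustration, not a function from the repository, and the sample strings are made up.

```python
def join_answers(answers):
    """Collapse per-image answers into one newline-separated prompt string."""
    return "\n".join(answers)


single = join_answers(["a cat on a sofa"])
batch = join_answers(["a cat on a sofa", "a dog in a park"])

print(repr(single))
print(repr(batch))
```

With one input image the prompt is unchanged. With a batch, every answer ends up in the same prompt, so a workflow that needs one prompt per image would have to handle the list differently.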

chflame163 commented 1 month ago

Thanks for your report. I've fixed it now.