zzk88862 closed this issue 8 months ago
Hi @zzk88862, thanks for raising this. This is actually a common case. In generic visual prompt mode (cross-image), you may need more than one prompt image, because objects of the same class can look very different from image to image (large intra-class variation). The generic visual embedding should therefore be built from multiple visual examples taken from different images. For example, using a different prompt image gives a better result.
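To illustrate the idea of building one generic embedding from several prompt images, here is a minimal sketch. It assumes each prompt image has already been turned into a plain float vector, and it uses a simple normalized average as the aggregation step. The function names `l2_normalize` and `aggregate_prompt_embeddings` are hypothetical and are not part of any library's API; the actual model may aggregate differently.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def aggregate_prompt_embeddings(per_image_embeddings):
    """Average unit-length embeddings from several prompt images.

    Assumption: one embedding (list of floats) per prompt image,
    all with the same dimensionality.
    """
    if not per_image_embeddings:
        raise ValueError("need at least one prompt embedding")
    unit = [l2_normalize(e) for e in per_image_embeddings]
    dim = len(unit[0])
    if any(len(u) != dim for u in unit):
        raise ValueError("all embeddings must have the same dimension")
    # Element-wise mean across images, then renormalize.
    mean = [sum(u[i] for u in unit) / len(unit) for i in range(dim)]
    return l2_normalize(mean)


# Toy example: two prompt images whose embeddings point in different directions.
embeddings = [[1.0, 0.0], [0.0, 1.0]]
generic = aggregate_prompt_embeddings(embeddings)
```

With a single prompt image the result is just that image's embedding, so it inherits whatever is unusual about that one example. Averaging over several images pulls the embedding toward what the examples have in common.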
The result is not good. Did I do something wrong in my steps?