Questions about multiple objects

Thanks for your in-depth analysis. This happens because the sneaker feature is more similar to the cat feature than to the background.

If the one-shot image contains both objects, a sneaker and a cat, the reference sneaker feature would correctly match other sneakers in new images with the highest scores, rather than the cat, as shown in Figure 9 for video object segmentation.

Originally posted by @ZrrSkywalker in https://github.com/ZrrSkywalker/Personalize-SAM/issues/9#issuecomment-1540066531

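Hello, I am also having trouble with multiple objects.

I found that the SAM model's interface accepts prompt combinations such as (single point, single box) and (multiple points, single box), but the (multiple points, multiple boxes) combination has problems at the code level. I am not sure whether I am misunderstanding the code, but I also noticed that your code seems to avoid a similar problem:
Regarding the problem above, I would like to ask about the specific steps for using the bbox mentioned in the Experimental Details of Section 4.2 of the paper. I assume you do it step by step: for example, first generate a mask with the point prompt, then use the bbox during the refinement stage.

That is all. I look forward to your reply.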
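
To make the question concrete, here is a minimal sketch of the two-step flow I am guessing at: a point-only prediction first, then a second prediction that adds a box derived from the first mask. It assumes `predictor` is a `segment_anything.SamPredictor` on which `set_image` has already been called. The helper names `mask_to_box` and `point_then_box` are my own and hypothetical, and this is not confirmed to be the procedure used in the paper.

```python
import numpy as np


def mask_to_box(mask):
    """Return a tight XYXY box around the nonzero pixels of a 2-D mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # empty mask, no box can be derived
    return np.array([xs.min(), ys.min(), xs.max(), ys.max()])


def point_then_box(predictor, point_coords, point_labels):
    """Hypothetical two-step prompting: points first, then points + box."""
    # Step 1: point prompt only, single mask output.
    masks, scores, logits = predictor.predict(
        point_coords=point_coords,
        point_labels=point_labels,
        multimask_output=False,
    )
    box = mask_to_box(masks[0])
    if box is None:
        return masks[0]

    # Step 2: refine with the same points, the derived box, and the
    # low-resolution logits of step 1 as the mask prompt.
    masks, scores, _ = predictor.predict(
        point_coords=point_coords,
        point_labels=point_labels,
        box=box,
        mask_input=logits,
        multimask_output=True,
    )
    # Keep the candidate with the highest predicted IoU score.
    return masks[int(np.argmax(scores))]
```

With several target objects, this sketch would be called once per object, each with its own points, so that only one box is passed per call.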

Questions about multiple objects #24