3dlg-hcvc / multion-challenge

Starter code and instructions for participating in MultiON Challenge 2021.

A few queries and potential issues #4

Open sai-prasanna opened 1 year ago

sai-prasanna commented 1 year ago

Hi, I have a few issues and queries. Some of them may be specific to the approach we're looking into at my lab (semantic segmentation -> BEV maps -> RL; a minimal sketch of the projection step follows the list below), but some may affect other approaches as well.

  1. Can you share the dataset generation script? It would be useful for my research on multi-object search; we use a different definition of the problem in which the objects are unordered.

  2. The robot's height might be insufficient to capture object placements. I believe the current height is 0.8m; have you tested whether the agent can see all the goals? Since many goals are placed on counters, shelves, etc., some aren't visible from this height.

  3. Many scenes seem to have goals placed on multiple floors; is this intended? It also makes 2D semantic mapping a bit harder.

  4. Objects are often placed at odd positions/angles or protrude into the mesh. I'm not sure whether this is because I use a newer version of Habitat or because imperfections in the dataset meshes make placement imprecise. Some objects are also scaled to unrealistic sizes.

  5. Zero-shot objects in the test set, while a good goal, make the task difficult for semantic-mapping-based approaches. The object textures, scales, and placements, and the sim2real gap of Habitat in general, make off-the-shelf semantic segmentation models hard to use, so one has to train or fine-tune a segmentation model on the Habitat scenes with the objects. It's unclear whether the zero-shot objects in validation/test may be used to train the segmentation model (not the navigation model, of course).
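For context, here is a minimal sketch of the segmentation -> BEV projection step in our pipeline. It assumes a pinhole depth camera with known intrinsics and the floor at y = 0; the intrinsics, camera height, and grid parameters are placeholders, not values from the challenge config:

```python
import numpy as np

def seg_depth_to_bev(seg, depth, fx, fy, cx, cy, cam_height=0.8,
                     grid_size=256, cell_m=0.05, num_classes=16):
    """Splat per-pixel semantic labels into an egocentric top-down grid.

    seg:   (H, W) int array of class ids in [0, num_classes)
    depth: (H, W) float array of metric depth along the optical axis (+Z)
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth                                  # forward distance (m)
    x = (u - cx) * z / fx                      # lateral offset (m, right is +)
    y = cam_height - (v - cy) * z / fy         # height above the floor (m)

    gx = (x / cell_m + grid_size // 2).astype(int)  # lateral cell index
    gz = (z / cell_m).astype(int)                   # forward cell index
    valid = (z > 0) & (gx >= 0) & (gx < grid_size) & (gz >= 0) & (gz < grid_size)
    valid &= (y > 0.1) & (y < 1.5)             # drop floor pixels and tall geometry

    bev = np.zeros((num_classes, grid_size, grid_size), dtype=np.uint8)
    bev[seg[valid], gz[valid], gx[valid]] = 1  # one channel per semantic class
    return bev
```

Accumulating these egocentric grids over time using the agent's pose gives the global semantic map; the multi-floor episodes from point 3 are exactly what breaks the single-grid assumption.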

tommyz94 commented 1 year ago

Hi Sai,

  * Robot height: yes, the robot's height is ~0.85m. We sampled only surfaces with a maximum height of 1.5m (with respect to the ground), so ideally the objects should be visible to the agent, but there can be corner cases where an object is not easily visible.

  * There are some multi-floor episodes, but they should be a minority.

  * The episodes are automatically generated, so there can be episodes where objects are placed in the middle of clutter. To place the objects, we selected the surfaces, computed the normals of the points representing each surface, and chose as candidate goal positions the points whose normals point upward (a rough sketch of this heuristic is included below).

  * Thank you for pointing out the objects with unnatural sizes; I'll look into them.

  * Zero-shot objects: nowadays there are many detectors/models available for zero-shot instance segmentation (e.g., Detic, CLIPSeg; see the usage sketch below). With this challenge we also want to push this aspect, because a robot in the real world should be able to find not only a closed set but an open set of objects.

  * Dataset generation code: I can't share it while the challenge is live, because it contains details that make the task easier (e.g., the possibility of recreating a training set with the "zero-shot" objects). Anyway, I will be happy to share it with you when the challenge ends!
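For concreteness, a rough numpy sketch of that placement heuristic on a triangle mesh; the thresholds and the floor-at-y=0 assumption are illustrative, not the actual generation code:

```python
import numpy as np

def candidate_goal_positions(vertices, faces, max_height=1.5, up_cos=0.95):
    """Face centroids whose normals point (nearly) straight up and that lie
    at most max_height above the floor (assumed to be at y = 0)."""
    tri = vertices[faces]                          # (F, 3, 3) triangle corners
    normals = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    normals /= np.linalg.norm(normals, axis=1, keepdims=True) + 1e-8
    centroids = tri.mean(axis=1)                   # (F, 3) per-face centres
    up = np.array([0.0, 1.0, 0.0])                 # Habitat's +Y-up convention
    upward = normals @ up > up_cos                 # keep near-horizontal faces
    low_enough = centroids[:, 1] <= max_height     # the 1.5 m cap mentioned above
    return centroids[upward & low_enough]
```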
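And a minimal example of prompting CLIPSeg for goal objects through HuggingFace transformers (the image path and prompt strings are placeholders; Detic would be used analogously through its own repo):

```python
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

image = Image.open("rgb_observation.png")         # the agent's RGB frame
prompts = ["a red cylinder", "a white cube"]      # open-vocabulary goal names
inputs = processor(text=prompts, images=[image] * len(prompts),
                   padding=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits               # (num_prompts, 352, 352)
masks = torch.sigmoid(logits) > 0.5               # one binary mask per prompt
```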

sai-prasanna commented 1 year ago

@tommyz94 Hi, thanks for your reply. Since the challenge deadline has passed, can you share the dataset generation code? I haven't made a submission.

I can share my email if you'd rather not post the code here. I am modifying the current challenge for my research, and having the dataset generator would be really helpful.