carla-simulator / carla

Open-source simulator for autonomous driving research.
http://carla.org
MIT License
11.09k stars, 3.57k forks

Non-player agents info for objects within the FoV of the camera only #63

Closed suvam-bag closed 4 years ago

suvam-bag commented 6 years ago

Is there a way to get the ground truth of the objects only within the FoV of the camera?

felipecode commented 6 years ago

Hello @suvam-bag! At the moment we don't have such a function. However, I think it could be quite useful! On the client side, it could be implemented from the data we already have. The client receives the camera pose, the FoV and the positions of all the objects present in the scene. With this, one could write a function that projects the camera view into 3D space and gets all the visible objects.
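A minimal client-side sketch of that idea, assuming a simple pinhole camera. The function names, the axis convention (x forward, y right, z up) and the yaw-only camera rotation are assumptions made for illustration and are not taken from the CARLA API. It only tests whether an object's position projects inside the image, so it does not account for occlusion.

```python
import math


def focal_length_px(image_width, fov_deg):
    """Focal length in pixels of a pinhole camera with the given horizontal FoV."""
    return image_width / (2.0 * math.tan(math.radians(fov_deg) / 2.0))


def project_to_image(point, cam_pos, cam_yaw_deg, image_width, image_height, fov_deg):
    """Return the pixel (u, v) of a world point, or None if it falls outside the view.

    Assumed convention (hypothetical, for illustration): world axes are
    x forward, y right, z up, and the camera only rotates about z (yaw).
    """
    dx = point[0] - cam_pos[0]
    dy = point[1] - cam_pos[1]
    dz = point[2] - cam_pos[2]
    yaw = math.radians(cam_yaw_deg)
    # Express the offset in the camera frame: distance ahead and to the right.
    forward = dx * math.cos(yaw) + dy * math.sin(yaw)
    right = -dx * math.sin(yaw) + dy * math.cos(yaw)
    if forward <= 0.0:
        return None  # the point is behind the camera
    f = focal_length_px(image_width, fov_deg)
    u = image_width / 2.0 + f * right / forward
    v = image_height / 2.0 - f * dz / forward
    if 0.0 <= u < image_width and 0.0 <= v < image_height:
        return (u, v)
    return None


def objects_in_fov(objects, cam_pos, cam_yaw_deg, image_width, image_height, fov_deg):
    """Keep only the objects of a {name: (x, y, z)} dict that project inside the image."""
    visible = {}
    for name, pos in objects.items():
        pixel = project_to_image(pos, cam_pos, cam_yaw_deg,
                                 image_width, image_height, fov_deg)
        if pixel is not None:
            visible[name] = pixel
    return visible
```

Calling `objects_in_fov` with the positions of the non-player agents and the camera parameters returns a dict of the agents inside the frustum together with their pixel coordinates. A full camera rotation (pitch and roll) and an occlusion test, for example against the depth image, would have to be added on top.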

This could also be done on the server side. The rendering engine already needs to determine the visible objects, so this should be low cost. One could investigate how to do that with Unreal.

Currently we are working on other things. Help from the community would be useful!

suvam-bag commented 6 years ago

Thanks @felipecode for the detailed answer! I will try to create a mapping function between the image pixels and the position of the ego-vehicle on the client side.

YashBansod commented 6 years ago

@suvam-bag, were you able to create the mapping? If so, could you help me with it?

YashBansod commented 6 years ago

@felipecode am I correct in understanding that the code to generate semantic segmentation images was written by the CARLA team? If so, is there a simple tweak to that algorithm that would produce instance segmentation?

I basically want to test a Faster R-CNN for object detection in CARLA. To get the ground-truth bounding boxes, I see two approaches:

  1. Using the given camera data and the 3D position, orientation and size of the object, project the bounding box onto the camera view. This requires some knowledge of perspective projection and other CV topics. Even then, it won't really work well, since the Z-axis length of the bounding box is not correct in the given data.

  2. Use the semantic segmentation to create a bounding box for each class in the scene. This is really easy to implement, with topX = minX, topY = minY, W = (maxX - minX) and H = (maxY - minY), but it only works when there is a single instance of the class in the scene. It would work easily if we were provided with instance-segmented images instead.
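The second approach can be sketched in a few lines. This is only an illustration: it assumes the semantic segmentation image has already been decoded into a 2D grid of integer class ids (a list of rows), and `class_bounding_box` is a hypothetical helper name, not part of the CARLA client.

```python
def class_bounding_box(labels, class_id):
    """Return (topX, topY, W, H) enclosing every pixel of class_id, or None.

    `labels` is a list of rows, each a list of integer class ids
    (assumed to be decoded from the semantic segmentation image).
    """
    xs = []
    ys = []
    for y, row in enumerate(labels):
        for x, value in enumerate(row):
            if value == class_id:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # the class does not appear in the image
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    # Same formulas as in the text: W = maxX - minX, H = maxY - minY.
    return (min_x, min_y, max_x - min_x, max_y - min_y)
```

With two cars in the image, this returns one box spanning both of them, which is the single-instance limitation described above.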

In the second scenario, even with the SemSeg images, if the objects were far apart from each other it would be possible to create the ground-truth bounding boxes by checking whether two blobs are connected (again, this requires some knowledge of CV). But it would fail completely if the objects overlapped in the SemSeg image.

Anyway, if you could suggest a way to create the bounding boxes of the objects in the camera FoV, it would be really helpful.

@nsubiron in #242 I think I didn't frame my question well. What I wanted was not just to filter the objects in the scene but to get their information with respect to the scene, for example the topX, topY, W, H of the bounding box of each object in the scene, in pixels.
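A rough sketch of the blob check, using a plain breadth-first flood fill with 4-connectivity. As before, it assumes the SemSeg image is available as a 2D grid of integer class ids, and `blob_bounding_boxes` is a hypothetical name chosen for this example.

```python
from collections import deque


def blob_bounding_boxes(labels, class_id):
    """Return one (topX, topY, W, H) box per connected blob of class_id pixels."""
    height = len(labels)
    width = len(labels[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if labels[y0][x0] != class_id or seen[y0][x0]:
                continue
            # Found an unvisited pixel of the class: flood-fill its blob.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                # Visit the 4-connected neighbours that belong to the same class.
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and labels[ny][nx] == class_id):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x, max_y - min_y))
    return boxes
```

Two cars that are separated in the image come out as two boxes. Two cars whose pixels touch are merged into a single blob and therefore a single box, which is exactly the failure case for overlapping objects.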

YashBansod commented 6 years ago

@nsubiron @felipecode , are there any plans to enable instance segmentation?

I was able to get the bounding boxes as long as the pixels of the instances of a class are not intersecting. But I still can't get bounding boxes correctly if the pixels are intersecting.

mhusseinsh commented 6 years ago

Hello @YashBansod, did you manage to get the bounding box information for only the objects present in the scene? I am trying to do the same task, using an object detector to detect the objects, so I need the GT info of the bounding boxes.

I really tried to dig through this a lot, but I can't manage to do it.

If you have any hints on how to do it, or a piece of code or a function that can be added to the client example to log this info, it would be really appreciated.

@nsubiron @felipecode any help here would be nice

YaduKini commented 5 years ago

Hi @mhusseinsh,

I am trying to solve the same problem. Did you find a solution? Please let me know.

kotchin commented 5 years ago

"The client receives the camera pose , FoV and the positions of all the objects present in the scene. With this, one could compute a function that projects the camera view on the 3D space and get all the visible objects."

Hi @felipecode, I'm very interested in what you describe above. Could you please elaborate on how this is done? Thank you!

germanros1987 commented 4 years ago

Problem solved in current versions of CARLA.

cinjon commented 3 years ago

@germanros1987 where do you see that this was solved?

fezsid commented 1 year ago

Can you please elaborate on how this was solved?