Open aaalloc opened 8 months ago
Can you detail more what the setup would be? Would there be multiple static monocular cameras on pan/tilt servos around a room? I'm not sure rtabmap would be ideal, as rtabmap primarily targets mobile robots/cameras by default. Do you need a 3D reconstruction, or just to detect/track the 3D pose of objects in the space?
For the title, you cannot use multiple monocular cameras unless you can generate depth for them somehow (from depth cameras or lidar).
Would there be multiple static monocular cameras on pan/tilt servo around a room?
Yes that's the idea
just detecting/tracking 3D pose of objects in the space?
Yes, when an object is detected and localized, the other cameras need to be aware of it too. I thought that maybe for doing that I would need a point cloud map built from the distinct cameras and shared across all of them, but I don't know ... Another idea was to do stereo depth mapping, but I have no leads on doing that with more than 2 cameras.
you cannot use multiple monocular cameras unless you can generate depth for them somehow (from depth cameras or lidar).
I can somehow manage to generate depth maps with an AI model like Depth-Anything or ZoeDepth, but that would be expensive, so I thought maybe there is a way of doing it without one, given that I can have multiple views of the room.
Reconstructing 3D space from multiple static monocular cameras can be a hard/expensive task. The systems I am thinking of require >10 cameras for a quite small volume. They do photogrammetry offline or even in real time, but many points of view looking at the same thing are required.
That's why I asked if you need 3D reconstruction or just tracking. For tracking, the system could look more like what OptiTrack or VICON systems do, though they require specific targets (like the small spheres used for motion capture). Once the cameras know where they are relative to each other in space, you can then track the same object across them.
AI depth with a monocular camera would give you depth, but without scale. With stereo cameras, you would get true depth. If you place 3 stereo cameras on pan/tilt around a room, then after you give the relative TF between all of them, you would not need a SLAM package, as you would know the global position/rotation of each camera at any moment in the same global frame. To scan in 3D, you may just accumulate the point clouds while the cameras are rotating. The most difficult part is to correctly calibrate the extrinsics between all the stereo cameras (i.e., their accurate relative positions), so that the point cloud generated from one camera overlaps correctly with the point cloud computed from another camera looking at the same thing.
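To make the "accumulate point clouds in one global frame" step concrete, here is a minimal numpy sketch (not rtabmap code): given each camera's extrinsics (rotation + translation relative to the global frame, which would come from your calibration), points seen in each camera frame can be transformed into the global frame and stacked. The extrinsic values below are hypothetical.

```python
import numpy as np

def camera_to_global(points_cam, R, t):
    """Transform Nx3 points from a camera frame to the global frame,
    given the camera's extrinsics (rotation R, translation t)."""
    return points_cam @ R.T + t

# Hypothetical extrinsics for two cameras (in practice, from calibration).
R1, t1 = np.eye(3), np.array([0.0, 0.0, 0.0])
# Camera 2: rotated 180 deg about Z, placed 4 m away along X.
R2 = np.array([[-1.0,  0.0, 0.0],
               [ 0.0, -1.0, 0.0],
               [ 0.0,  0.0, 1.0]])
t2 = np.array([4.0, 0.0, 0.0])

# Both cameras observe the same physical point, each in its own frame.
p_cam1 = np.array([[2.0, 0.0, 1.0]])
p_cam2 = np.array([[2.0, 0.0, 1.0]])

# Accumulated cloud: with correct extrinsics, both rows land on the
# same global coordinate, i.e., the clouds overlap as they should.
cloud = np.vstack([camera_to_global(p_cam1, R1, t1),
                   camera_to_global(p_cam2, R2, t2)])
```

If the extrinsics are wrong, the two rows of `cloud` diverge, which is exactly the misalignment between overlapping clouds described above.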
With stereo cameras, you would get true depth.
Unfortunately I can't replace the monocular cameras with stereo cameras, but do you think it is feasible to make a stereo camera out of 2 monocular cameras placed far apart like this?
Theoretically yes, though practically difficult to set up and calibrate.
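For intuition on the "theoretically yes" part: once both monocular cameras are calibrated (intrinsics and relative pose), a matched pixel pair can be triangulated into a 3D point, which is all a stereo camera does internally. A minimal numpy sketch of linear (DLT) triangulation, with assumed intrinsics and a 2 m baseline (all values hypothetical):

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views.
    P1, P2: 3x4 projection matrices; x1, x2: pixel coords (u, v)."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]          # dehomogenize

# Assumed shared intrinsics; second camera shifted 2 m along X.
K = np.array([[600.0,   0.0, 320.0],
              [  0.0, 600.0, 240.0],
              [  0.0,   0.0,   1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-2.0], [0.0], [0.0]])])

# Project a known 3D point into both views, then recover it.
X_true = np.array([0.5, 0.2, 5.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
X_est = triangulate_dlt(P1, P2, x1, x2)
```

The practical difficulty is that with a wide baseline, both the extrinsic calibration and the pixel matching between the two views become much harder than with a rigid off-the-shelf stereo pair.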
This is what I was thinking you were doing (tracking an object from multiple monocular cameras in a room):
Interesting survey: "Multi-camera multi-object tracking: A review of current trends and future advances" https://www.sciencedirect.com/science/article/pii/S0925231223006811
Hi, I have multiple questions.
I'm currently doing a project where I have multiple IP cameras and I need to detect and locate objects. Is rtabmap suitable for this?
I've seen this repository: https://github.com/ROBOTIS-JAPAN-GIT/turtlebot3_slam_3d and this is exactly what I need to do, but with multiple stationary cameras (pan/tilt is possible). I've seen that it is possible to use only RGB and depth images according to this: https://github.com/introlab/rtabmap/issues/1071.
My goal for now is to make things work with only one camera and see if I can expand later, but is that possible with rtabmap_ros? If yes, could you give an example of how to do that? I understand that I need to provide topics for these fields: camera_info_topic, depth_topic and rgb_topic, and I have already prepared a ROS node that publishes Image messages; I just don't really understand how to plug things together to make it work.
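For the single RGB-D camera case, wiring usually means remapping rtabmap's expected topics to the ones your node publishes. A sketch with the ROS1 `rtabmap_ros rtabmap.launch` arguments (the `/my_camera/...` topic names are placeholders for whatever your node actually publishes; `approx_sync:=true` relaxes timestamp matching if your RGB and depth images are not hardware-synchronized):

```shell
roslaunch rtabmap_ros rtabmap.launch \
    rgb_topic:=/my_camera/rgb/image \
    depth_topic:=/my_camera/depth/image \
    camera_info_topic:=/my_camera/rgb/camera_info \
    frame_id:=camera_link \
    approx_sync:=true
```

Note that rtabmap needs a registered depth image (same resolution/viewpoint as the RGB image) and a valid CameraInfo with correct intrinsics, and your node must publish consistent timestamps on all three topics for the synchronization to work.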