neilsorkin19 opened this issue 1 year ago
Hi! Thanks for the interest, and sorry for the slow answer, I'm on vacation. Your biggest current issue is that you still need an odometry method (i.e. something to initialize the poses in the graph). You can afterwards use maplab for loop-closure, multisession mapping, and batch optimization.
The odometry method that we happen to provide alongside is ROVIOLI, which is VIO only. Have you tried ORB-SLAM2 or DSO, for example? If you get either one of those working, you can come back and I can go into detail about the multisession mapping.
I did try orb_slam_2_ros with some success. How can I proceed with that for multisession mapping?
Thanks.
Hi there, sorry that this is obviously not going to provide any helpful answers. I just happened to read this issue and was glad to see such an interesting application of visual SLAM. So were you using a camera stream from a video game? In that case, how would the camera intrinsics (usually published via /camera/camera_info) be made available? Thanks a lot!
I actually ran some game-captured images (frames pulled from game footage looking at roughly the same area, but moving around to give different view angles) through COLMAP, and it gave me camera intrinsics which I put into the right format and published via /camera/camera_info. You can read more about the camera models here: https://colmap.github.io/cameras.html. It has been a while since I worked with it, but I think you can take what COLMAP outputs and insert it into a camera.yaml file (see the conversion sketch after the example below). You would have to figure out which parameters to change, but the format of the camera.yaml file is:
image_width: 640
image_height: 272
camera_name: "camera"
camera_matrix:
  rows: 3
  cols: 3
  data: [265.07249514822331, 0., 320, 0., 265.07249514822331, 136, 0., 0., 1.]
camera_model: "plumb_bob"
distortion_coefficients:
  rows: 1
  cols: 5
  data: [0, 0, 0, 0, 0]
rectification_matrix:
  rows: 3
  cols: 3
  data: [1, 0, 0, 0, 1, 0, 0, 0, 1]
projection_matrix:
  rows: 3
  cols: 4
  data: [265.07249514822331, 0., 320, 0., 0., 265.07249514822331, 136, 0., 0., 0., 1., 0.]
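In case it helps, here is a minimal sketch of that conversion (my own script, not part of maplab or COLMAP): it assumes COLMAP estimated a PINHOLE model, reads COLMAP's cameras.txt text export, and writes the camera.yaml structure shown above using PyYAML. The function name colmap_pinhole_to_camera_yaml is just made up for the example.

import yaml

def colmap_pinhole_to_camera_yaml(cameras_txt_path, out_path, name="camera"):
    # COLMAP text export lines look like: "CAMERA_ID MODEL WIDTH HEIGHT PARAMS..."
    with open(cameras_txt_path) as f:
        for line in f:
            if line.startswith("#") or not line.strip():
                continue
            fields = line.split()
            model, width, height = fields[1], int(fields[2]), int(fields[3])
            if model != "PINHOLE":
                raise ValueError("only PINHOLE is handled in this sketch, got " + model)
            fx, fy, cx, cy = map(float, fields[4:8])
            break

    calib = {
        "image_width": width,
        "image_height": height,
        "camera_name": name,
        "camera_matrix": {"rows": 3, "cols": 3,
                          "data": [fx, 0.0, cx, 0.0, fy, cy, 0.0, 0.0, 1.0]},
        "camera_model": "plumb_bob",
        # PINHOLE has no distortion parameters, so the coefficients stay at zero.
        "distortion_coefficients": {"rows": 1, "cols": 5, "data": [0, 0, 0, 0, 0]},
        "rectification_matrix": {"rows": 3, "cols": 3,
                                 "data": [1, 0, 0, 0, 1, 0, 0, 0, 1]},
        "projection_matrix": {"rows": 3, "cols": 4,
                              "data": [fx, 0.0, cx, 0.0,
                                       0.0, fy, cy, 0.0,
                                       0.0, 0.0, 1.0, 0.0]},
    }
    with open(out_path, "w") as f:
        yaml.safe_dump(calib, f, default_flow_style=None, sort_keys=False)

colmap_pinhole_to_camera_yaml("cameras.txt", "camera.yaml")

If COLMAP picked a model with distortion (e.g. SIMPLE_RADIAL or OPENCV), you would map those parameters into distortion_coefficients instead of zeros, or undistort the images first.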
I have successfully built maplab and I want to use it to process a camera stream without an IMU. I have /camera/image_raw and /camera/camera_info topics. My /camera/image_raw topic runs at 120 Hz. More specifically, my data comes from a video game, Valorant. Here is an example of an input frame (I trimmed most of the frame at the top and bottom to cut out static information displayed on screen, so that's why it's very wide). The in-game FOV is 103 degrees. I think that with this very high frame rate, "global shutter", and lack of camera roll motion, it should be fairly doable. I did try lsd_slam and it worked with some tweaking, but it would not close loops very easily, and once lost, it would make a mess.
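As a rough sanity check on the intrinsics (a sketch only; it assumes the 103 degrees is the horizontal FOV of the full 640-pixel-wide frame, which the vertical crop does not change), the implied focal length in pixels is:

import math

width_px = 640    # frame width in pixels; the crop is vertical only
hfov_deg = 103.0  # in-game FOV, assumed to be the horizontal FOV
fx = (width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
print(fx)  # roughly 254.5 px

Depending on how the game actually defines its FOV setting, this may not match a calibrated value exactly, so treat it only as a rough cross-check.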
In the maplab 2.0 paper under the "Mapping Node" section you say "maplab 2.0 does not even require an IMU", so I am hopeful that my use case can be accomplished.
Can you advise me on how I can get started with just the /camera/image_raw and /camera/camera_info topics? My objective is to build a VI-Map very thoroughly while in a custom game with no other players, and then, once in a game with other players, to localize only and track player movement throughout the map. All processing can be done offline, but I would prefer online localization. Thanks.