microsoft / lamar-benchmark

Source code for the ECCV 2022 paper "Benchmarking Localization and Mapping for Augmented Reality".

Rendering navvis mesh to dense depth maps in other (iOS/hl) sessions #49

Open ziruiw-dev opened 8 months ago

ziruiw-dev commented 8 months ago

Hi LAMAR dataset authors,

Thanks for making and releasing this dataset.

I am wondering whether it is possible to render dense depth maps for an iOS session using the mesh provided in a NavVis session, i.e. by projecting the mesh with the iOS trajectories. If I understand correctly, some registration files (dir location1/registration) and alignment files (dirs hololens1/sessions/proc/alignment and phone1/sessions/proc/alignment)... are not provided in the current raw data release?

The planned data structure in CAPTURE.md is shown below:

location1/                                  # a Capture directory
├── sessions/                               # a collection of Sessions 
│   ├── navvis1/                            # NavVis Session #1
│   │   ├── sensors.txt                     # list of all sensors with specs
│   │   ├── rigs.txt                        # rigid geometric relationship between sensors
│   │   ├── trajectories.txt                # pose for each (timestamp, sensor)
│   │   ├── images.txt                      # list of images with their paths
│   │   ├── pointclouds.txt                 # list of point clouds with their paths
│   │   ├── raw_data/                       # root path of images, point clouds, etc.
│   │   │   ├── images_undistorted/
│   │   │   └── pointcloud.ply
│   │   └── proc/                           # root path of processed assets
│   │       ├── meshes/                     # a collection of meshes
│   │       ├── depth_renderings.txt        # a list of rendered depth maps, one per image
│   │       ├── depth_renderings/           # root path for the depth maps
│   │       ├── alignment_global.txt        # global transforms between sessions
│   │       ├── alignment_trajectories.txt  # transform of each pose to a global reference
│   │       └── overlaps.h5                 # overlap matrix from this session to others
│   ├── hololens1/
│   │   ├── sensors.txt
│   │   ├── rigs.txt
│   │   ├── trajectories.txt
│   │   ├── images.txt
│   │   ├── depths.txt                      # list of depth maps with their paths
│   │   ├── bluetooth.txt                   # list of bluetooth measurements
│   │   ├── wifi.txt                        # list of wifi measurements
│   │   ├── raw_data/
│   │   │   ├── images/
│   │   │   └── depths/
│   │   └── proc/
│   │       └── alignment/
│   └── phone1/
│       └── ...
├── registration/                           # the data generated during alignment
│   ├── navvis2/
│   │   └── navvis1/                        # alignment of navvis2 w.r.t navvis1
│   │       └─ ...                          # intermediate data for matching/registration
│   ├── hololens1/
│   │   └── navvis1/
│   └── phone1/
│       └── navvis2/
└── visualization/                          # root path of visualization dumps
    └─ ...                                  # all the data dumped during processing (TBD)
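
For what it's worth, I currently read these text files with a small helper along these lines (just a sketch assuming the comma-separated layout with '#' comment lines described in CAPTURE.md; the exact column semantics should be checked against the spec):

# Hedged sketch: read a Capture-style text file such as trajectories.txt or images.txt.
# Assumes comma-separated rows and '#' comment lines, as described in CAPTURE.md.
from pathlib import Path

def read_capture_csv(path):
    rows = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith('#'):  # skip empty and comment lines
            continue
        rows.append([field.strip() for field in line.split(',')])
    return rows

# e.g. rows = read_capture_csv('location1/sessions/hololens1/trajectories.txt')
# each row starts with a timestamp and a sensor/rig id, followed by the pose fields.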

Some extra context: I am currently using scantools/run_sequence_rerendering.py, and my plan is to:

  1. [A2A] render the mesh from navvis session A to dense depth maps using trajectories in navvis session A;
  2. [A2B] render the mesh from navvis session A to dense depth maps using trajectories in navvis session B;
  3. [A2C] render the mesh from navvis session A to dense depth maps using trajectories in iOS session C.

I managed to get A2A working perfectly and A2B working okay (there seem to be occasional surface normal direction issues), but I am stuck at step A2C. I am wondering if I could get some advice or example code?
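
For concreteness, here is roughly how I render a single depth map once a camera pose is expressed in the mesh frame (a generic trimesh + pyrender sketch, not the scantools raytracer; obtaining T_mesh_from_cam is the part I am stuck on for A2C, since it needs the iOS trajectory composed with the session alignment):

# Generic depth-rendering sketch with trimesh + pyrender (not the scantools raytracer).
# T_mesh_from_cam: 4x4 camera-to-mesh transform, assumed to already include the
# cross-session alignment; fx, fy, cx, cy: pinhole intrinsics of the target camera.
import numpy as np
import trimesh
import pyrender  # may need PYOPENGL_PLATFORM=egl for headless rendering

def render_depth(mesh_path, T_mesh_from_cam, fx, fy, cx, cy, width, height):
    mesh = trimesh.load(mesh_path, force='mesh')
    scene = pyrender.Scene()
    scene.add(pyrender.Mesh.from_trimesh(mesh))

    # pyrender uses the OpenGL convention (camera looks along -Z), while CV poses
    # look along +Z, so flip the Y and Z axes of the camera frame.
    camera_pose = T_mesh_from_cam @ np.diag([1.0, -1.0, -1.0, 1.0])
    camera = pyrender.IntrinsicsCamera(fx=fx, fy=fy, cx=cx, cy=cy)
    scene.add(camera, pose=camera_pose)

    renderer = pyrender.OffscreenRenderer(viewport_width=width, viewport_height=height)
    _, depth = renderer.render(scene)  # depth in mesh units, 0 where no surface is hit
    renderer.delete()
    return depth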

Best, Zirui

sarlinpe commented 8 months ago

Ground-truth poses are only released as part of the evaluation data subset, for keyframes in the mapping & validation sequences. We did not release any poses as part of the raw data because we do not want to expose them for any sequence that includes test queries. So:

  1. You can only use the GT poses of mapping or validation sequences, but not of test sequences.
  2. If you need poses for all timestamps (not only for keyframes), you would need to interpolate them from the nearest keyframes, e.g. with a linear model. We do not have code for this but it would be a valuable addition to the Pose object. As a starting point, check out scipy.spatial.transform.Slerp for rotation interpolation.
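
The interpolation could look something like this (just a sketch, not part of scantools; note that scipy expects scalar-last quaternions, so reorder if your poses store qw first):

# Sketch of keyframe pose interpolation: SLERP for rotations, linear for translations.
# key_times must be sorted and query_times must lie within [key_times[0], key_times[-1]].
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_poses(key_times, key_quats_xyzw, key_trans, query_times):
    slerp = Slerp(key_times, Rotation.from_quat(key_quats_xyzw))
    rotations = slerp(query_times)  # interpolated Rotation objects
    translations = np.stack(
        [np.interp(query_times, key_times, key_trans[:, i]) for i in range(3)],
        axis=-1)  # per-axis linear interpolation
    return rotations.as_quat(), translations  # quats (x, y, z, w) and (M, 3) translations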

@mihaidusmanu We could consider releasing full-framerate poses for all sequences (excluding test sequences) if there is a need for it; that would make the interpolation in (2) unnecessary.

ziruiw-dev commented 8 months ago

Hi @sarlinpe, thanks a lot for the reply. Very helpful, and I successfully rendered depth maps for the validation splits.