SpectacularAI / HybVIO

HybVIO visual-inertial odometry and SLAM system
https://arxiv.org/abs/2106.11857
GNU General Public License v3.0

Recreating Online Arxiv Paper Results for TUM-VI #29

Closed. ArmandB closed this issue 2 years ago.

ArmandB commented 2 years ago

Love the paper, thank you so much for putting it and the code out there!

When I was trying to recreate the paper results, I noticed that my EuRoC results match but the TUM-VI ones do not. Looking at the paper, I found that the online-stereo "Normal SLAM" row in Table B2 has RMSE values identical to the postprocessed results in Table 4.

I suspect that this is just a typo although I could be wrong here.

Cheers and all the best!

oseiskar commented 2 years ago

Hello!

Good point, thank you for reporting. I think it's not really a typo, but in Table 4 we may have actually compared our online SLAM result to the postprocessed results for other methods, because our postprocessing did not work well with that dataset. This is OK in the sense that online methods are a subset of postprocessed methods (those with zero post-processing), but we should have written this more clearly in the paper. (also FYI @pekkaran)

Attached the tables here for reference:

[Table 4 and Table B2 from the paper, attached as images]

ArmandB commented 2 years ago

Thanks for your fast response!

That's good to know, because I'm running TUM with the parameters "-maxSuccessfulVisualUpdates=20 -useStereo -useSlam -timer=true" (like the "Normal SLAM" parameters in the paper) and am getting RMSEs that differ from what the paper reports:

Room1 - "RMSE": 0.02413978985557401
Room2 - "RMSE": 0.027370991450391104
Room3 - "RMSE": 0.014600381138423596
Room4 - "RMSE": 0.017374768510435266
Room5 - "RMSE": 0.02477042526419998
Room6 - "RMSE": 0.022124688783236424

I also ran with the TUM default settings in compare_to_others.py (-maxSuccessfulVisualUpdates=5) and those results did not match either.

My fork is based on commit 3a46cd6c instead of e325353, so that could be part of the issue. I didn't see any changes to the source code between those two commits, but there could be something I'm missing. I can also look at a --postprocess run and see how the error compares; I would have done that already, but there was a Python error when running with --postprocess that I need to sort out first. I'll report what the issue turns out to be if I figure it out.

pekkaran commented 2 years ago

Hello.

Omitting -maxSuccessfulVisualUpdates=20 and using the default 5 is correct for the TUM data.

Did you obtain your RMSE numbers using the --outputDir argument of compute_paper_results.py? The metric computation is somewhat complex to reproduce manually with the vio_benchmark tools and/or the HybVIO main binary. For example, vio_benchmark implements multiple metrics, and the default one isn't the same as the one used in the paper (SE3 RMSE, set for TUM here).
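(For anyone trying to recompute the numbers by hand: below is a minimal sketch of a single-alignment SE3 RMSE in the spirit of that metric. It is not the vio_benchmark implementation; the function names are made up, and it assumes the estimated and ground-truth positions have already been time-associated.)

```python
import numpy as np

def align_se3(est, gt):
    """Rigid alignment (rotation + translation, no scale) of est onto gt, Kabsch/Umeyama style."""
    # est, gt: (N, 3) arrays of time-associated positions
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g
    U, _, Vt = np.linalg.svd(E.T @ G)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # guard against reflections
    R = (U @ D @ Vt).T
    t = mu_g - R @ mu_e
    return R, t

def se3_rmse(est, gt):
    """RMSE of position error after one global SE3 alignment."""
    R, t = align_se3(est, gt)
    err = (est @ R.T + t) - gt
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))
```

Small differences in time association, interpolation, or which poses are included in the alignment can already shift the RMSE in the later decimals, which is why going through compute_paper_results.py is the safest route.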

Looking at the TUM data conversion script, one thing I notice is that it uses the lower-resolution version by default. Changing this line to use 1024 might improve the results (or do the opposite), although I'm fairly sure we tested reproducing the paper results, including downloading the data from scratch with the provided scripts.
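(To illustrate for later readers what kind of change is meant: the public TUM-VI packages come in 512x512 and 1024x1024 variants whose names encode the resolution. The snippet below is only a hypothetical sketch of such a switch, not the actual conversion script; the variable and function names are made up.)

```python
# Hypothetical excerpt of a TUM-VI download/convert helper: which package is
# fetched is decided by the resolution string embedded in the sequence name.
RESOLUTION = "1024"  # was "512"; selects e.g. dataset-room1_1024_16 instead of dataset-room1_512_16

def sequence_name(room: int, resolution: str = RESOLUTION) -> str:
    return f"dataset-room{room}_{resolution}_16"
```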

Finally, we sometimes had difficulties reproducing the exact results across different machines. For example, the test data is saved as videos, which are decompressed using the system installation of FFmpeg (via OpenCV functions). A different FFmpeg version or other related operating system differences could make the decompressed images slightly different, causing a butterfly effect in the VIO results. However, if you were able to produce exactly the same numbers for EuRoC as we did (up to the precision we reported), then it should be plausible to do the same with the TUM data.
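(A generic way to check that suspicion, not part of the HybVIO tooling: hash the decoded frames on both machines with OpenCV, which uses the system FFmpeg under the hood. The video path below is a placeholder.)

```python
import hashlib
import cv2

def frame_digest(video_path: str, max_frames: int = 100) -> str:
    """Hash the raw pixel data of the first frames; differing digests across
    machines indicate the FFmpeg/OpenCV decode pipeline is not bit-identical."""
    cap = cv2.VideoCapture(video_path)
    h = hashlib.sha256()
    count = 0
    while count < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        h.update(frame.tobytes())
        count += 1
    cap.release()
    return h.hexdigest()

if __name__ == "__main__":
    # Placeholder path; point this at one of the converted benchmark videos.
    print(frame_digest("data/benchmark/tum-room1/data.mp4"))
```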

The commit you used for your fork should indeed produce the same results.

ArmandB commented 2 years ago

Your intuition was correct. Using the 1024x1024 resolution TUM-VI images and omitting -maxSuccessfulVisualUpdates=20 allowed me to recreate the online results to the precision that you reported. Thank you guys so much!!! I will change the title of this issue and also put some of my system/ffmpeg information below in case it helps posterity:

ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
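(If someone wants to compare against this on their own machine, one simple way to capture the decode-related versions is via OpenCV, whose build information lists the FFmpeg libraries it links against; this is generic Python, not part of the HybVIO scripts.)

```python
import cv2

# Print OpenCV's version and its full build configuration; the "Video I/O"
# section shows which FFmpeg libraries OpenCV uses for decoding the videos.
print("OpenCV", cv2.__version__)
print(cv2.getBuildInformation())
```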