daniilidis-group / m3ed

M3ED Dataset

Missing IMU Calibration? #3

Closed klowrey closed 1 year ago

klowrey commented 1 year ago

While there is calibration data for the different cameras, I cannot locate the IMU calibration data mentioned on the data overview page (https://m3ed.io/overview/). Guidance on where to find this data would be appreciated. Additionally, there doesn't seem to be a separate time map between the IMU and the left event camera (or any time mapping for the IMU at all).

As a lower-priority note, the data in the HDF5 files appears to be inconsistently compressed: some datasets have LZF applied, while others are uncompressed. For instance, the IMU accelerometer/gyroscope data is uncompressed, while the IMU timestamps are LZF-compressed. While LZF may be the default for h5py, it 1) isn't the most space-efficient compression (so downloads from AWS are slower than they need to be) and 2) isn't available in all distributions of the HDF5 libraries.
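For reference, here is a quick way to survey which filter each dataset in a file actually uses (a sketch; "sample.h5" is a placeholder, not an actual M3ED file name):

```python
# Sketch: list the compression filter applied to every dataset in an
# HDF5 file, to spot inconsistencies like the one described above.
import h5py

def report_compression(path):
    """Return {dataset_name: compression} for every dataset in the file."""
    report = {}

    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            # obj.compression is None, 'lzf', 'gzip', ... per dataset
            report[name] = obj.compression

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return report
```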

fcladera commented 1 year ago

Hi @klowrey, thanks for your interest in M3ED!

While there is calibration data for the different cameras, I cannot locate the IMU calibration data mentioned on the data overview page (https://m3ed.io/overview/). Guidance on where to find this data would be appreciated.

We are currently uploading v1.1 of the dataset including the calibration files for the IMU. As the calibration is performed only between the OVC imagers, we did one calibration for the tower (car/spot) and one calibration for the UAV, and used these for all the sequences. This data will be uploaded shortly.

Additionally, there doesn't seem to be a separate time map between the IMU and the left event camera (or any time mapping for the IMU).

We are focusing on cleaning up our code for release, and we hope to add this mapping in v1.2.

As a lower-priority note, the data in the HDF5 files appears to be inconsistently compressed: some datasets have LZF applied, while others are uncompressed. For instance, the IMU accelerometer/gyroscope data is uncompressed, while the IMU timestamps are LZF-compressed.

Thanks for mentioning this. We will make sure that all the compression schemes are consistent for the upcoming release.

While LZF may be the default for h5py, it 1) isn't the most space-efficient compression (so downloads from AWS are slower than they need to be) and 2) isn't available in all distributions of the HDF5 libraries.

We did have an internal discussion about which compression format to use. While gzip provides higher compression ratios, its decompression time is longer (https://www.h5py.org/lzf/). Is this a major issue for you?
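Since the trade-off depends heavily on the actual data, one way to settle it is to measure directly. A minimal benchmark sketch (the test array is a stand-in for real IMU traces, so treat this as a way to measure, not a conclusion):

```python
# Sketch: write the same array with LZF and gzip, then compare file size
# and read-back time. Results vary with the data being compressed.
import os
import tempfile
import time
import h5py
import numpy as np

def bench(compression, data):
    """Write `data` with the given filter; return (file_size, read_seconds)."""
    path = os.path.join(tempfile.mkdtemp(), "bench.h5")
    with h5py.File(path, "w") as f:
        f.create_dataset("d", data=data, compression=compression)
    size = os.path.getsize(path)
    t0 = time.perf_counter()
    with h5py.File(path, "r") as f:
        f["d"][...]  # force a full decompressing read
    return size, time.perf_counter() - t0

if __name__ == "__main__":
    # Smooth, IMU-like signal (a random walk) as a rough stand-in.
    data = np.cumsum(np.random.standard_normal(1_000_000))
    for comp in ("lzf", "gzip"):
        size, dt = bench(comp, data)
        print(f"{comp}: {size} bytes, {dt:.4f}s to read")
```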

klowrey commented 1 year ago

Some of the newer compression methods, like LZ4 or Zstandard, could do even better and are supported within HDF5 files. There's an argument for you to compress as much as possible for distribution, and let other people recompress (or drop compression entirely) for faster local reads. Gzip would work for me, but I think a) different folks may have different requirements, and b) you'd have to benchmark anyway.
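The "recompress locally" step could look something like this (a sketch; paths are placeholders, and a user could equally swap the filter for anything their HDF5 build supports, or pass `compression=None` to strip it):

```python
# Sketch: copy every dataset from one HDF5 file into another with a chosen
# compression filter, so heavily compressed distribution files can be
# re-packed locally for faster reads.
import h5py

def repack(src_path, dst_path, compression="gzip", level=9):
    opts = {"compression": compression}
    if compression == "gzip":
        opts["compression_opts"] = level

    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
        def copy(name, obj):
            if isinstance(obj, h5py.Dataset):
                if obj.shape == ():
                    # Scalar datasets cannot be chunked/compressed.
                    dst.create_dataset(name, data=obj[()])
                else:
                    dst.create_dataset(name, data=obj[...], **opts)
            else:
                dst.require_group(name)

        src.visititems(copy)
        for k, v in src.attrs.items():
            dst.attrs[k] = v
```

(Note this sketch copies root-level attributes but not per-dataset attributes; the `h5repack` command-line tool that ships with HDF5 handles all of that for you.)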

Given that these datasets are large enough that they may not fit entirely in a computer's main memory, reading from disk will be necessary; but computers are powerful enough that decompression will likely not be the bottleneck in that case (disk reads will be). As such, it'd be important to ensure the data is stored contiguously so it can be memory-mapped for faster access. I'm not sure how this is done with Python's HDF5 library, but it's simple in Julia (https://juliaio.github.io/HDF5.jl/stable/#Memory-mapping) and offers a big performance speed-up when reading from disk.
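It turns out the same idea is possible in Python, since h5py exposes a dataset's byte offset within the file. A sketch (path and dataset name are placeholders; this only works for contiguous, uncompressed datasets, mirroring HDF5.jl's `readmmap` restriction):

```python
# Sketch: memory-map a contiguous, uncompressed HDF5 dataset with numpy,
# bypassing the HDF5 read path entirely.
import h5py
import numpy as np

def mmap_dataset(path, name):
    """Return a read-only np.memmap view of a contiguous HDF5 dataset."""
    with h5py.File(path, "r") as f:
        ds = f[name]
        offset = ds.id.get_offset()  # None for chunked/compressed datasets
        if offset is None:
            raise ValueError(f"{name} is not contiguous; cannot memory-map")
        dtype, shape = ds.dtype, ds.shape
    return np.memmap(path, mode="r", dtype=dtype, shape=shape, offset=offset)
```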

fcladera commented 1 year ago

Hi @klowrey. We wanted to follow up after the release of v1.1.

All calibrations have been released, and all data files now include the calibrations embedded. We created #5 to address your comments regarding compression.

Thank you for your feedback; it is really important to us!