apple / ml-hypersim

Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding

Fixed or variable focal length #75

Closed bryan-dailabs closed 1 month ago

bryan-dailabs commented 1 month ago

Is this dataset created with a fixed, or variable focal length? If fixed, what is the value or values?

I am using some models trained on this dataset (Depth Anything V2), and the metric depth estimates do not match real-world measurements. One possible cause is that our camera has a different focal length / field of view than the Hypersim data.

It seems that if the camera intrinsics at prediction time don't match those of the training dataset, the model will not be able to predict accurate metric depth, but perhaps my reasoning is off.
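
To make that reasoning concrete: under an ideal pinhole model, an object of physical size S at depth Z spans roughly f * S / Z pixels, so a single image cannot distinguish a short focal length at close range from a long focal length farther away. Below is a minimal sketch of that ambiguity. The function names and numbers are hypothetical, focal lengths are assumed to be in pixels at the same image resolution, and the focal-ratio rescaling is a common first-order heuristic, not something Hypersim or Depth Anything V2 is confirmed to prescribe.

```python
def projected_size_px(focal_px: float, object_size_m: float, depth_m: float) -> float:
    """Pixel extent of a fronto-parallel object under an ideal pinhole camera."""
    return focal_px * object_size_m / depth_m


def rescale_depth(pred_depth_m: float, train_focal_px: float, actual_focal_px: float) -> float:
    """Heuristic correction (an assumption): scale depth by the focal length ratio."""
    return pred_depth_m * actual_focal_px / train_focal_px


# Same 1 m object seen by two hypothetical cameras: doubling both the focal
# length and the depth gives the same pixel footprint.
size_a = projected_size_px(focal_px=500.0, object_size_m=1.0, depth_m=2.0)
size_b = projected_size_px(focal_px=1000.0, object_size_m=1.0, depth_m=4.0)
print(size_a, size_b)

# A model trained at 500 px focal length that predicts 2 m on a 1000 px camera
# would, under this heuristic, be corrected to 4 m.
corrected = rescale_depth(pred_depth_m=2.0, train_focal_px=500.0, actual_focal_px=1000.0)
print(corrected)
```

If the predictions are off by a roughly constant factor across a scene, comparing that factor to the focal length ratio is a quick way to check whether an intrinsics mismatch is the likely cause.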

Thanks for any pointers!

mikeroberts3000 commented 1 month ago

The camera intrinsics used in Hypersim are documented extensively in our README.

bryan-dailabs commented 1 month ago

Oh goodness, I'm not sure how I even missed that. Thanks, and apologies for the noise.

bryan-dailabs commented 1 month ago

In case someone else is looking for this: it seems the FOV is hard-coded to pi/3, to match the DIODE dataset: https://github.com/apple/ml-hypersim/blob/20f398f4387aeca73175494d6a2568f37f372150/code/python/tools/scene_generate_camera_trajectories_random_walk.py#L79
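
For comparison with a real camera, a field of view converts to a focal length in pixels via f = (W / 2) / tan(FOV / 2). Below is a rough sketch of that conversion. It assumes the pi/3 value is the horizontal FOV and that images are 1024 pixels wide; both are assumptions here, and the helper name is hypothetical, so check the README for the values that apply to your scene.

```python
import math


def focal_length_px(fov_rad: float, size_px: int) -> float:
    """Pinhole focal length in pixels for a given FOV along one image axis."""
    return (size_px / 2.0) / math.tan(fov_rad / 2.0)


fov_x = math.pi / 3.0  # value from the linked script
width_px = 1024        # assumed image width, not taken from this thread

fx = focal_length_px(fov_x, width_px)
print(round(fx, 2))
```

The same function applied to your own camera's FOV and image width gives a focal length you can compare against this one.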

mikeroberts3000 commented 1 month ago

Indeed, but for any future readers that might find this thread, the exact camera intrinsics can vary slightly per scene. See here for details.