Closed railgun526 closed 7 months ago
Generally, intrinsics are provided in units of pixels in the following format:
[[fx, 0, cx],
[ 0, fy, cy],
[ 0, 0, 1]]
The intrinsics pixelSplat uses are normalized, i.e., the first row is divided by w, and the second row is divided by h. This makes it so the intrinsics don't change if you uniformly scale the image.
I'm not sure what format ScanNet uses, but if the ScanNet intrinsics are in units of pixels, the same division should apply. The slightly suspicious thing is that if you do this, cx and cy aren't 0.5 anymore, which is the case for almost all uncropped images. Are you sure the ScanNet intrinsics haven't been modified somehow? What size are the images these intrinsics correspond to?
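To make the division concrete, here is a minimal sketch in plain Python. The helper names `normalize_intrinsics` and `scale_intrinsics` and the 640x480 camera values are made up for illustration and are not taken from the pixelSplat code. It divides the first row by the width and the second row by the height, then checks that a half-resolution copy of the same camera gives the same normalized matrix.

```python
def normalize_intrinsics(K, width, height):
    """Divide row 0 of a pixel-unit 3x3 K by width and row 1 by height."""
    return [
        [v / width for v in K[0]],
        [v / height for v in K[1]],
        list(K[2]),
    ]


def scale_intrinsics(K, s):
    """Pixel-unit intrinsics after uniformly resizing the image by factor s."""
    return [[v * s for v in K[0]], [v * s for v in K[1]], list(K[2])]


# Hypothetical 640x480 camera; the numbers are illustrative only.
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]

K_norm = normalize_intrinsics(K, 640, 480)

# The same camera at half resolution (320x240) normalizes to the same matrix.
K_half_norm = normalize_intrinsics(scale_intrinsics(K, 0.5), 320, 240)

print(K_norm)
print(K_half_norm)
```

If the normalized cx and cy come out far from 0.5 for images that were never cropped, that is a hint the intrinsics belong to a different image size than the one used for the division.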
As for the near and far planes, you can ignore the ones set inside the dataset itself. These are only used by the baselines, which don't automatically choose near and far planes. pixelSplat will automatically pick near/far planes using the code in src/datasets/shims/bounds_shim.py.
Thank you, it works!
Generally, intrinsics are provided in units of pixels in the following format:
[[fx, 0, cx], [ 0, fy, cy], [ 0, 0, 1]]
The intrinsics pixelSplat uses are normalized, i.e., the first row is divided by fx, and the second row is divided by fy. This makes it so the intrinsics don't change if you uniformly scale the image.
I'm not sure what format ScanNet uses, but if the ScanNet intrinsics are in units of pixels, the same division should apply. The slightly suspicious thing is that if you do this, cx and cy aren't 0.5 anymore, which is the case for almost all uncropped images. Are you sure the ScanNet intrinsics haven't been modified somehow? What size are the images these intrinsics correspond to?
As for the near and far planes, you can ignore the ones set inside the dataset itself. These are only used by the baselines, which don't automatically choose near and far planes. pixelSplat will automatically pick near/far planes using the code in src/datasets/shims/bounds_shim.py.
Hi @dcharatan, the ScanNet intrinsics are in units of pixels, and they correspond to an image size of (h,w)=(968,1296). But why are the intrinsics divided by (fx, fy)? I think they should be divided by (w, h) according to the README. I divided intrinsic[:2] by (w,h) and plotted the epipolar line as below:
In addition, I noticed that cx and cy are always 0.5 in your data. But after dividing the ScanNet intrinsics, cx and cy won't be 0.5. Will that be a problem?
@Pixie8888 You're right. The README is correct, and what I originally wrote above (dividing by fx and fy instead of w and h) was a mistake/typo. Sorry about that! In general, having cx and cy be a value that's not 0.5 isn't inherently a problem, but it should make you double-check that the intrinsics are correct. As far as the pixelSplat code is concerned, since it's based on diff-gaussian-rasterization, it only supports a principal point in the center of the image at (0.5, 0.5). As a workaround, I would suggest cropping the input images so the principal point is at the image center and then adjusting the intrinsics accordingly.
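One way the cropping workaround could look is sketched below in plain Python. The function `center_crop_for_principal_point` and the intrinsic values are hypothetical and are not part of pixelSplat; only the (h,w)=(968,1296) image size comes from this thread. The sketch assumes zero skew and pixel-unit intrinsics. It finds the largest integer crop window centered on (cx, cy) and shifts the principal point by the crop offset, so the new principal point lands within half a pixel of the crop center.

```python
def center_crop_for_principal_point(K, width, height):
    """Return (x0, y0, new_w, new_h, K_new) for the largest crop centered on (cx, cy).

    K is a pixel-unit 3x3 intrinsic matrix (nested lists); skew is assumed to be 0.
    """
    cx, cy = K[0][2], K[1][2]
    # Largest whole-pixel half-extent that fits on both sides of the principal point.
    half_w = int(min(cx, width - cx))
    half_h = int(min(cy, height - cy))
    # Top-left corner of the crop window, snapped to whole pixels.
    x0 = int(round(cx - half_w))
    y0 = int(round(cy - half_h))
    new_w, new_h = 2 * half_w, 2 * half_h
    # Cropping only shifts the principal point; focal lengths are unchanged.
    K_new = [
        [K[0][0], 0.0, cx - x0],
        [0.0, K[1][1], cy - y0],
        [0.0, 0.0, 1.0],
    ]
    return x0, y0, new_w, new_h, K_new


# Hypothetical pixel-unit intrinsics for a 1296x968 image (values are illustrative).
K_px = [[1170.0, 0.0, 647.75], [0.0, 1170.0, 483.75], [0.0, 0.0, 1.0]]
x0, y0, new_w, new_h, K_new = center_crop_for_principal_point(K_px, 1296, 968)

# Normalize with the cropped size, as described in the README.
cx_norm = K_new[0][2] / new_w
cy_norm = K_new[1][2] / new_h
print(x0, y0, new_w, new_h, cx_norm, cy_norm)
```

The image itself would then be cropped to rows y0 to y0 + new_h and columns x0 to x0 + new_w before any resizing, and the normalized focal lengths computed from K_new with new_w and new_h.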
Hi @dcharatan, first I'd like to thank you for your excellent work and congratulate you on the CVPR acceptance of your paper! I have recently been working on running pixelSplat on the ScanNet dataset, but I have run into a few problems. The first is about the intrinsic matrix. For ScanNet, the intrinsic matrix looks like this: tensor([[[[72.2301, 0.0000, 39.9825],[ 0.0000, 72.2301, 29.8596],[ 0.0000, 0.0000, 1.0000]]) but for this repo on re10k, it looks like [0.8900, 0.0000, 0.5000], [0.0000, 0.8902, 0.5000], [0.0000, 0.0000, 1.0000]. I am wondering whether you have made any modifications to the original intrinsic matrix, like changing the unit from millimeters to pixels? The other is that I cannot fully understand what the "far" and "near" planes mean in this setting. I saw your comments in #25, but I still don't know how to set the distances to make disparity negligible. Could you please provide some example code? I'd appreciate it if you could help me!