Open LeaFendd opened 12 months ago
This sounds awesome, I think it would be a great addition-- if you plan on submitting a PR for this, I'd be happy to help.
Would we also want another export method to generate an ortho photo of the scene?
Thank you for the reply! I've submitted PR #2648. You can generate an ortho photo with the ORTHOPHOTO CameraType just like any other CameraType; here is a render example with instant-ngp.
But I don't know how to integrate it into ns-viewer yet; I'll try it later.
@LeaFendd I am trying to use your ortho camera with SDFStudio, but I get a matrix addition error due to a broadcasting problem at this line : coord_x = (coord_x + 0.5 - self.cx) / scale * self.fx
coord_x shape is [1000,1000], the size of one of my images, while self.cx is [91,1], the number of my images. I cannot understand how this operation can make any sense. To my understanding, we are supposed to get as a result a 2D matrix of size 91 (number of cameras) or 2048 (number of rays in my batch) in the first dimension, to then stack it to get to 3D and finally multiply it with c2w.
Could you please explain the reasoning behind this operation and how I could make it work?
Sorry, I didn't receive your message in time.
This step is used to generate ray_origin. First, a meshgrid is created on the xoy plane, represented by homogeneous coordinates (x, y, z, h). Then, the c2w matrix is applied to transform the meshgrid to the actual camera position.
Since the orthographic camera is only used for rendering and not considered for training, I only considered the scenario where cx and cy have a single value when using it. In my usage, cx and cy are both a single Float rather than a vector, which led to a bug in broadcasting.
Thank you for your feedback, I will fix it as soon as I can.
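The step described above can be sketched as follows. This is a minimal illustration in NumPy, not the code from the PR: the function name `ortho_ray_origins`, the argument layout, the `(u + 0.5 - cx) / fx` scaling, and the downward-pointing image y axis are all assumptions. The point it shows is that per-camera intrinsics of shape [N, 1] cannot broadcast against an [H, W] pixel grid directly, but do once they are reshaped to [N, 1, 1].

```python
import numpy as np

def ortho_ray_origins(c2w, fx, fy, cx, cy, height, width):
    """Return ray origins of shape [N, H, W, 3] for N orthographic cameras.

    c2w: [N, 3, 4] camera-to-world matrices.
    fx, fy, cx, cy: [N, 1] per-camera intrinsics.
    """
    # Pixel index grids, each of shape [H, W].
    v, u = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    # [N, 1] -> [N, 1, 1] so each per-camera value broadcasts over [H, W].
    fx, fy, cx, cy = (a.reshape(-1, 1, 1) for a in (fx, fy, cx, cy))
    x = (u + 0.5 - cx) / fx    # [N, H, W]
    y = -(v + 0.5 - cy) / fy   # [N, H, W]; image y assumed to point down
    # Homogeneous points (x, y, 0, 1) on each camera's xy plane: [N, H, W, 4].
    pts = np.stack([x, y, np.zeros_like(x), np.ones_like(x)], axis=-1)
    # Apply each camera's [3, 4] matrix to its own grid of points.
    return np.einsum("nij,nhwj->nhwi", c2w, pts)
```

With 91 cameras and 1000 x 1000 images, as in the report above, this would return an array of shape [91, 1000, 1000, 3]; in practice one would index only the sampled pixels of a ray batch rather than build the full grid for every camera.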
You're welcome. I definitely think all cameras should be usable for both training and final render, otherwise it would become a bit confusing and go against the "plug'n play" interchangeability allowed by the modularity of the framework.
yeah, I'm refactoring a more general version, give me some time
How is the refactoring going? I'm working on other parts of my model, but I'll probably be done with them by the end of next week. Do you think your new version will be available by then?
I just committed the new version in my PR #2648
@LeaFendd May I ask whether there are any plans to support orthographic rendering for Gaussian splatting? Or could you give some suggestions on how to implement this? I can help implement it.
Sorry, I don't have such a plan at the moment.
You just need to construct an orthographic projection matrix to replace the perspective projection matrix.
I think you should modify the code here:
https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/models/splatfacto.py#L699
To implement an orthographic_projection_matrix() function, you can refer to https://www.songho.ca/opengl/gl_projectionmatrix.html
Good luck!
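For reference, a minimal sketch of such a function, following the OpenGL-style orthographic matrix described on the linked page. The explicit `left`/`right`/`bottom`/`top`/`near`/`far` arguments and the use of NumPy are assumptions made for illustration. The matrix assumes a camera looking down -z; renderers that look along +z would need the signs of the z terms checked against their own convention.

```python
import numpy as np

def orthographic_projection_matrix(left, right, bottom, top, near, far):
    """OpenGL-style orthographic projection (camera looks down -z)."""
    P = np.zeros((4, 4))
    # Scale the view box to the [-1, 1] cube on each axis.
    P[0, 0] = 2.0 / (right - left)
    P[1, 1] = 2.0 / (top - bottom)
    P[2, 2] = -2.0 / (far - near)
    # Translate the center of the view box to the origin.
    P[0, 3] = -(right + left) / (right - left)
    P[1, 3] = -(top + bottom) / (top - bottom)
    P[2, 3] = -(far + near) / (far - near)
    # w stays 1, so there is no perspective divide.
    P[3, 3] = 1.0
    return P

# Opposite corners of the view box should land on opposite corners of the cube.
P = orthographic_projection_matrix(-2.0, 2.0, -1.0, 1.0, 0.1, 10.0)
near_corner = P @ np.array([-2.0, -1.0, -0.1, 1.0])
far_corner = P @ np.array([2.0, 1.0, -10.0, 1.0])
```

Unlike the perspective matrix, the bottom row here is (0, 0, 0, 1), so projected x and y do not depend on depth.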
@LeaFendd Thank you for your reply. Following your suggestion, I constructed the orthographic projection matrix as follows:
import math
import torch

def projection_matrix(znear, zfar, fovX, fovY):
    # Half-extents of the view box, measured at the far plane.
    tanHalfFovY = math.tan(fovY / 2)
    tanHalfFovX = math.tan(fovX / 2)
    top = tanHalfFovY * zfar
    bottom = -top
    right = tanHalfFovX * zfar
    left = -right

    P = torch.zeros(4, 4)
    z_sign = 1.0

    # Scale and center x and y.
    P[0, 0] = 2.0 / (right - left)
    P[0, 3] = -(right + left) / (right - left)
    P[1, 1] = 2.0 / (top - bottom)
    P[1, 3] = -(top + bottom) / (top - bottom)
    # Map depth linearly between the near and far planes.
    P[2, 2] = -2.0 / (zfar - znear)
    P[2, 3] = -(zfar + znear) / (zfar - znear)
    # Bottom row (0, 0, 0, 1): no perspective divide.
    P[3, 3] = z_sign
    return P
And the rendered result is as follows. The result is orthographic, since the buildings' facades are not visible, but it is foggy. I think this is due to the lack of depth information. Am I right, and do you have any suggestions for solving this? Thank you!
Setting a bigger z_near may help, I think.
BTW, Happy Chinese New Year! :)
Haha, happy Chinese New Year! I tried setting a bigger z_near (e.g. z_near=60), but the result is the same; z_near and z_far do not seem to be used in the 3DGS rendering process.
Any update on this matter? How can I use ns-render to render an orthophoto?
How did you get this render? Which part of the code did you change?
Since several major updates have been released after this issue was opened (especially gsplat v1.0), the method mentioned above for rendering orthophotos in gsplat is no longer available.
I think it's time to start supporting orthophoto rendering in gsplat, but I'm busy job hunting for now. If all goes well, I expect to post an update in 3 weeks.
In the meantime, you can follow the approach mentioned above and modify the projection matrix, which requires changing the source in gsplat, not in nerfstudio.
Is your feature request related to a problem? Please describe.
At present, nerfstudio does not support orthographic rendering. However, for NeRF, generating orthographic images merely requires producing a set of parallel rays for rendering. This capability is essential for various applications but is currently lacking in nerfstudio. I plan to submit a PR for this feature.
Describe the solution you'd like
I propose to extend the existing Cameras class by adding an orthographic camera model to CameraType. This model would accept the same camera parameters as the perspective camera model already present in nerfstudio, allowing users to easily switch between perspective and orthographic modes.
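To illustrate what "a set of parallel rays" means with the same camera parameters, here is a rough single-camera sketch in camera space. It is not the proposed implementation: the function name `pixel_rays`, the `orthographic` flag, the `(u + 0.5 - cx) / fx` scaling, and the -z viewing direction are assumptions for illustration. In the perspective case all rays share one origin and differ in direction; in the orthographic case all rays share one direction and differ in origin.

```python
import numpy as np

def pixel_rays(height, width, fx, fy, cx, cy, orthographic):
    """Camera-space ray origins and directions, each of shape [H, W, 3]."""
    v, u = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    x = (u + 0.5 - cx) / fx
    y = -(v + 0.5 - cy) / fy  # image y assumed to point down
    zeros, ones = np.zeros_like(x), np.ones_like(x)
    if orthographic:
        # Origins spread over the image plane; every ray points along -z.
        origins = np.stack([x, y, zeros], axis=-1)
        directions = np.stack([zeros, zeros, -ones], axis=-1)
    else:
        # All rays start at the camera center and fan out through the pixels.
        origins = np.stack([zeros, zeros, zeros], axis=-1)
        directions = np.stack([x, y, -ones], axis=-1)
        directions /= np.linalg.norm(directions, axis=-1, keepdims=True)
    return origins, directions
```

Both branches take the same intrinsics, which is what would let a user switch between the two modes without changing the camera description.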