Closed Rashfu closed 4 months ago
Intuitively, there are fewer parameters in the reconstructed model. Also, unlike 4DGS, where views come from many angles, colonoscopy gives a nearly fixed viewing angle. To account for view-dependent color, we would first have to model the lighting and even the tissue texture. These can be future work toward more comprehensive modeling.
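To put a rough number on "fewer parameters", here is a minimal sketch. It assumes the standard 3DGS setup of degree-3 spherical harmonics per color channel, and the helper name sh_color_params is made up for illustration. It is not taken from the EndoGSLAM code.

```python
def sh_color_params(degree: int, channels: int = 3) -> int:
    """Number of color floats per Gaussian when color is stored as
    spherical harmonics of the given degree: (degree + 1)^2 per channel."""
    return (degree + 1) ** 2 * channels


# Assumption: degree 3, as in the original 3DGS implementation.
sh_params = sh_color_params(3)
# A plain color attribute c stores one value per RGB channel.
rgb_params = 3

print(f"SH color floats per Gaussian:  {sh_params}")
print(f"RGB color floats per Gaussian: {rgb_params}")
```

Under these assumptions, each Gaussian carries 48 color floats with SH and 3 with a plain color attribute.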
Thanks for your prompt and detailed explanation!
Because I encountered issues while reproducing the metrics, I did not open a new issue.
"optimize pose at half-resolution, refine every 2 frames". Are there corresponding hyperparameters that can be modified? Thanks in advance.
In the base config,
desired_image_height=1080//2,
desired_image_width=1350//2,
the //2 is there because preprocessing halves the resolution. To track at half resolution, set
tracking_image_height=1080//4,
tracking_image_width=1350//4,
and set map_every = 2 to refine every 2 frames.
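As a quick sanity check of what these settings resolve to, here is a small sketch. Only the five key names come from the reply above. The flat dict layout and the variable names are assumptions for illustration, and the keys may be nested differently in the real config file.

```python
# Raw frame size implied by the reply (1080 x 1350 before preprocessing).
FULL_HEIGHT, FULL_WIDTH = 1080, 1350

# Hypothetical flat view of the relevant settings; the repository's config
# may nest these keys differently.
overrides = dict(
    desired_image_height=FULL_HEIGHT // 2,   # preprocessing already halves the frames
    desired_image_width=FULL_WIDTH // 2,
    tracking_image_height=FULL_HEIGHT // 4,  # half of the preprocessed resolution
    tracking_image_width=FULL_WIDTH // 4,
    map_every=2,                             # refine every 2 frames
)

for key, value in overrides.items():
    print(f"{key} = {value}")
```

Note that the divisions are integer divisions, so 1350//4 rounds down to 337.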
The time is weighted by sequence duration. The average is not the plain mean over the 10 sequences but all time / all frames. As far as I remember, the time comes out slightly better with the per-frame average... I save the time per scene in time.txt. I calculated the final results with a simple Python calculator. If you want to compare, you can use any fair comparison method.
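The difference between the two averaging schemes can be sketched as follows. The scene names and numbers are made up purely for illustration and are not results from the paper or the released files.

```python
# Illustrative (made-up) per-scene numbers: (frame count, average seconds per frame).
scenes = {
    "scene_a": (100, 0.80),
    "scene_b": (300, 1.00),
}


def mean_of_scene_means(scenes):
    """Plain average over sequences: every scene counts equally."""
    return sum(t for _, t in scenes.values()) / len(scenes)


def frame_weighted_mean(scenes):
    """All time / all frames: longer scenes weigh more."""
    total_time = sum(n * t for n, t in scenes.values())
    total_frames = sum(n for n, _ in scenes.values())
    return total_time / total_frames


print("mean of scene means:", mean_of_scene_means(scenes))
print("frame-weighted mean:", frame_weighted_mean(scenes))
```

The two numbers agree only when all scenes have the same length or the same per-frame time, so a comparison should state which one it uses.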
Thanks for your explanation. I will try to reproduce the results and update here.
I used the released _EndoGSLAMrecons.zip and, for each scene, multiplied the Average Tracking/Frame Time and Average Mapping/Frame Time from runtime.txt by the total number of frames in that scene to obtain the total duration per scene. Finally, I summed these durations and divided by the total number of frames (the sum of the line counts of all _eval/estw2c.txt files). However, my speed results are significantly higher than those reported for EndoGSLAM-H in the paper.
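For reference, the procedure described above can be sketched like this. The function names are hypothetical, the sketch assumes each pose file holds one estimated pose per line, and it leaves out parsing runtime.txt because its exact layout is not shown in this thread.

```python
from pathlib import Path


def count_frames(pose_file) -> int:
    """Count non-empty lines, assuming one estimated pose per line."""
    with open(Path(pose_file)) as f:
        return sum(1 for line in f if line.strip())


def overall_time_per_frame(per_scene):
    """per_scene: list of (tracking_s_per_frame, mapping_s_per_frame, n_frames).

    Rebuilds each scene's total duration from its per-frame averages,
    then divides the summed duration by the summed frame count.
    """
    total_time = sum((trk + mp) * n for trk, mp, n in per_scene)
    total_frames = sum(n for _, _, n in per_scene)
    return total_time / total_frames
```

In use, n_frames for each scene would come from count_frames on that scene's pose file, and the two per-frame times from the corresponding runtime.txt.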
Here are the results:
Update: I ran EndoGSLAM-H on my machine (RTX3090 + Intel Silver CPU, not a good choice I guess) and got the following results:
So, did I miss something or misunderstand your explanation?
Tips:💡 The quantitative analysis in the paper can be effectively reproduced. Although there may be some errors, the overall consistency is maintained.
This is one of my recorded results. I also ran on RTX2080Ti + AMD TR-2950X (16 cores), and the frame time for cecum_t1_b is 0.8375244798944957, which seems faster than your RTX 3090. This is abnormal. I am using the code I pushed. The only possibilities I can think of are the cuda/torch/rasterizer versions. I know that some better implementations can speed up rasterization by about 30%, but here yours appears slower...
My frame time for cecum_t1_b is 0.7837918445245544 (0.29460237097384323 + 0.48918947355071113), which is faster than your RTX2080Ti. And I think software and driver versions do have a huge impact on speed. I just think there is a big difference between the speed of EndoGSLAM-G in the paper and mine. Never mind. I reproduced all metrics apart from the EndoGSLAM-H speed.
Great work.
OK... I misread the row. And maybe the RTX4090 just runs 3DGS faster? Anyway.
Hi author, I am confused by the following sentence in the article:
We first replace SH coefficients with a color attribute c based on the fact that lighting primarily moves with the camera in endoscopy, reducing the need for complex view-dependent effects modeling.
Could you please explain why this reduces the need for modeling?