InternLandMark / FlashGS

MIT License

Questions about con_o and exp #3

Closed: forresti closed this issue 3 weeks ago

forresti commented 4 weeks ago

The more I dig into your code, the more interesting it is!

Couple of questions:

Question 1: con_o sign and multiplier

On this line: https://github.com/InternLandMark/FlashGS/blob/a9d0ca05f8fb67c0ffa0ca92fb3c15436b3da016/csrc/cuda_rasterizer/render.cu#L41-42

The code is:

    //float power = -0.5f * (con_o.x * d.x * d.x + con_o.z * d.y * d.y) - con_o.y * d.x * d.y;
    float power = con_o.w + con_o.x * d.x * d.x + con_o.z * d.y * d.y + con_o.y * d.x * d.y;

What was your thought process for making this change? The new expression drops the minus signs and the 0.5 factor and adds a con_o.w term. Do you flip the signs elsewhere in the code to make this work? And do you scale con_o.x and con_o.z by 0.5 elsewhere?

Question 2: exp

On this line: https://github.com/InternLandMark/FlashGS/blob/a9d0ca05f8fb67c0ffa0ca92fb3c15436b3da016/csrc/cuda_rasterizer/render.cu#L44

My question is: Is your goal to compute 2^x or e^x? If you're computing e^x, do you do the change of base between 2 and e elsewhere in the code?

Thank you

forresti commented 3 weeks ago

Aha, I think the answer to both my questions is here:

https://github.com/InternLandMark/FlashGS/blob/a9d0ca05f8fb67c0ffa0ca92fb3c15436b3da016/csrc/cuda_rasterizer/preprocess.cu#L509

chensiyan96 commented 3 weeks ago

Since this part is one of the key floating-point computation bottlenecks, I pre-scaled the coefficients and applied assembly-level optimization here.
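
For readers following along, here is a minimal host-side C++ sketch of how this kind of pre-scaling can work in principle. It is not taken from FlashGS: the struct names, the helper names, and in particular the guess that con_o.w holds log2(opacity) are assumptions made for illustration. The idea is to fold the -0.5 factor, the signs, and the change of base log2(e) into the conic coefficients once per Gaussian, so that the per-pixel expression becomes a plain sum fed to a base-2 exponential.

```cpp
#include <cmath>

// Hypothetical stand-ins for CUDA's float4 / float2 (names are illustrative).
struct Conic4 { float x, y, z, w; };
struct Float2 { float x, y; };

// Reference form, matching the commented-out line in the question, with the
// opacity applied afterwards:
//   alpha = opacity * e^(-0.5 * (a*dx^2 + c*dy^2) - b*dx*dy)
inline float alpha_reference(float a, float b, float c, float opacity, Float2 d) {
    float power = -0.5f * (a * d.x * d.x + c * d.y * d.y) - b * d.x * d.y;
    return opacity * std::exp(power);
}

// One-time pre-scaling per Gaussian (assumed scheme, requires opacity > 0).
// Uses e^p = 2^(p * log2(e)) and opacity = 2^(log2(opacity)).
inline Conic4 prescale(float a, float b, float c, float opacity) {
    const float log2e = 1.4426950408889634f;  // log2(e)
    return Conic4{
        -0.5f * log2e * a,   // x: sign, 0.5 factor, and change of base folded in
        -log2e * b,          // y: sign and change of base folded in
        -0.5f * log2e * c,   // z: same treatment as x
        std::log2(opacity)   // w: opacity moved into the exponent (assumption)
    };
}

// Per-pixel form, matching the shape of the new line in the question.
inline float alpha_prescaled(Conic4 con_o, Float2 d) {
    float power = con_o.w + con_o.x * d.x * d.x + con_o.z * d.y * d.y
                + con_o.y * d.x * d.y;
    return std::exp2(power);  // base-2 exponential, no extra multiply needed
}
```

Under these assumptions, alpha_prescaled(prescale(a, b, c, opacity), d) agrees with alpha_reference(a, b, c, opacity, d) up to floating-point rounding. The inner loop then needs no sign flips, no 0.5 multiply, and no separate opacity multiply, which is the sort of saving that matters in a floating-point-bound kernel. Check preprocess.cu at the link above for what the repository actually does.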