WillYao-THU closed this issue 4 months ago
We did not plan to include them because we wanted the repo to contain only the functionality used by the method.
"Linear - Tri-Plane" is straightforward to implement. Here, we don't need the steerable pyramid, and we just compute: T1_mag = T1 + alpha * (T1 - T0) directly on the triplane values.
"Position Shift" and "Encoding Shift" depend on a non-triplane NeRF backbone (including them would make this repo too complicated). Basically, it would require training the NeRF using the original MLP setup: (x, y, z) -> PosEnc(x, y, z) -> MLP(PosEnc(x, y, z)) -> (RGB, opacity). With "Position Shift", we would only tweak (x, y, z) and freeze everything else for later timesteps. With "Encoding Shift", we would only finetune the Positional Encoding output. The finetuning essentially learns a residual term to add back to the original value. After all training, magnification is obtained by linearly amplifying the differences characterized by the residuals.
Let me know if you need further clarification. Thanks.
Thank you so much for your detailed explanation. Your clarification on the implementation was incredibly helpful. Best regards!
Hi,
I wonder if the released code includes implementations of the "Linear – Tri-Plane", "Position Shift", and "Encoding Shift" methods. It seems that only the "Phase – Tri-Plane" method is implemented.
Looking forward to your reply, thanks!