daniel03c1 / masked_wavelet_nerf

MIT License
79 stars · 5 forks

Description: Related work in Paper #13

Open zwxdxcm opened 11 months ago

zwxdxcm commented 11 months ago
[image: excerpt from the paper's description with text highlighted in blue]

I am wondering if this description is correct. Why is QAT not mentioned in the blue highlight?

Thanks!

zwxdxcm commented 11 months ago

I mean, if you want to train a neural field from scratch, it is reasonable to implement end-to-end (E2E) compression, which should use QAT. For a pre-trained model, it would be better to use PTQ.
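(For context, a minimal sketch of the distinction, assuming a PyTorch-style setup; `fake_quantize` and the bit width are illustrative, not taken from this repository. QAT keeps a simulated quantization step inside the training graph so the weights adapt to it, while PTQ quantizes an already trained tensor once.)

```python
import torch

def fake_quantize(x, num_bits=8):
    # Uniformly quantize x to num_bits levels and dequantize again (simulated quantization).
    qmax = 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / qmax
    x_hat = torch.round((x - x.min()) / scale) * scale + x.min()
    # Straight-through estimator: forward pass uses x_hat, gradient flows as if identity.
    return x + (x_hat - x).detach()

# QAT: the quantizer sits inside the training loop, so gradients still reach the weights.
w = torch.randn(64, requires_grad=True)
loss = (fake_quantize(w) ** 2).sum()
loss.backward()

# PTQ: a finished, pre-trained tensor is quantized once, with no further training.
with torch.no_grad():
    w_deployed = fake_quantize(w.detach())
```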

daniel03c1 commented 11 months ago

Sorry for the confusion. You are right; our description was written incorrectly. Thank you.

zwxdxcm commented 11 months ago

> Sorry for the confusion. You are right; our description was written incorrectly. Thank you.

Thanks for your reply. Just to double-check: this work implements QAT, since it trains the network from scratch. Have you compared how much time is spent on the additional computations during both training and inference?

zwxdxcm commented 11 months ago

I also have two more questions.

  1. This work's target is to reduce memory (runtime storage) rather than model storage, right? So what is reported in the experiments section is memory?
  2. I am wondering how the mask process is designed. Is there any related work on it? I cannot quite understand it.

[images: the masking formulation from the paper]

Thank you !

zwxdxcm commented 11 months ago

Oh, I understand; it seems like a threshold function. But why would you use the stop-gradient operator?
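(For anyone else reading: a minimal sketch of what a thresholded mask combined with a stop-gradient, i.e. a straight-through trick, can look like, assuming PyTorch; the names and the sigmoid parameterization are illustrative rather than the repository's exact code. The hard 0/1 threshold has zero gradient almost everywhere, so the stop-gradient operator lets the forward pass use the binary mask while the backward pass differentiates through the soft sigmoid instead.)

```python
import torch

def binary_mask(mask_param):
    """Hard 0/1 mask in the forward pass, soft sigmoid gradient in the backward pass."""
    soft = torch.sigmoid(mask_param)      # differentiable surrogate in [0, 1]
    hard = (soft > 0.5).float()           # threshold: not differentiable on its own
    # Stop-gradient (detach) on the difference: forward = hard, backward uses d(soft)/d(param).
    return soft + (hard - soft).detach()

# Usage: element-wise masking of a coefficient grid with a learnable mask parameter.
coeffs = torch.randn(4, 4, requires_grad=True)
mask_param = torch.zeros(4, 4, requires_grad=True)
masked = coeffs * binary_mask(mask_param)
masked.sum().backward()                   # both coeffs and mask_param receive gradients
```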

daniel03c1 commented 11 months ago

Thank you for your interest in our work.

  1. Regarding the time, it only increases the training time, and the exact training times are reported in the supplementary material. During inference, the wavelet coefficients only need to be converted once, which takes a constant amount of time to unpack the sparse representations into spatial grids, so the inference time equals that of the original model without the mask or wavelet transform (see the sketch after this list).
  2. The work only considers storage memory. It might need extra memory during training, but during inference it will require as much memory as the original model.
  3. I believe this issue is related to #10. If you have further questions, please feel free to ask.

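(To illustrate the one-time conversion in point 1: a minimal sketch, assuming PyWavelets and a single 2D grid; the sparse storage layout and variable names are hypothetical, not the repository's format. The surviving wavelet coefficients are scattered into dense bands and inverse-transformed once at load time, after which inference only touches the dense grid, so the per-query cost matches the unmasked model.)

```python
import numpy as np
import pywt

def decode_once(shapes, sparse_coeffs, wavelet="haar"):
    # Hypothetical payload: per-band shapes plus (indices, values) of the kept coefficients.
    dense = []
    for shape, (idx, val) in zip(shapes, sparse_coeffs):
        band = np.zeros(shape, dtype=np.float32)
        band[tuple(idx)] = val             # scatter the surviving coefficients
        dense.append(band)
    cA, cH, cV, cD = dense
    # Single inverse wavelet transform: performed once at load time, not per query.
    return pywt.idwt2((cA, (cH, cV, cD)), wavelet)

# After this call, the model queries `grid` directly, with no mask or wavelet cost at inference.
# grid = decode_once(stored_shapes, stored_sparse_coeffs)
```
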
zwxdxcm commented 11 months ago

OK. Thanks ~~

zwxdxcm commented 11 months ago

From what I understood, the masking part is more like a learnable frequency filter. Am I correct?

daniel03c1 commented 11 months ago

It is not necessarily a frequency filter. The masking method itself can filter whatever you want. For example, if you apply the masking method to spatial grids, it filters spatial coefficients. If you apply it to frequency grids (after a DCT), you get frequency filters.
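(A minimal sketch of this point, assuming SciPy's DCT and a hypothetical pre-computed 0/1 mask: the same element-wise mask either prunes spatial grid entries directly or, applied after a DCT, prunes frequency coefficients instead.)

```python
import numpy as np
from scipy.fft import dctn, idctn

grid = np.random.randn(8, 8)            # a small spatial feature grid
mask = (np.random.rand(8, 8) > 0.5)     # hypothetical 0/1 mask (learned in practice)

# Applied to the spatial grid: individual spatial coefficients are zeroed out.
spatial_filtered = grid * mask

# Applied after a DCT: the same mask now zeroes frequency coefficients,
# so it acts as a frequency filter once transformed back.
freq_filtered = idctn(dctn(grid, norm="ortho") * mask, norm="ortho")
```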

zwxdxcm commented 11 months ago

Thanks for your reply!