daniel03c1 / masked_wavelet_nerf

MIT License
79 stars 5 forks

How long does it take for training? #5

Closed 9B8DY6 closed 1 year ago

9B8DY6 commented 1 year ago

I would expect sparse representations to speed up neural radiance field training, but the paper reports no training-time results, so I would like to ask how long training takes.

Also, I cannot understand one sentence in your paper. What is a raw mask? And shouldn't 'wavelet coefficients with the same levels of sparsity have similar levels of sparsity' be changed to 'wavelet coefficients with the same levels of sparsity have the same levels of sparsity'?

This is based on our experimental results that wavelet coefficients with the same levels of sparsity have similar levels of sparsity (Fig. 7).

[Fig. 7 image]

benhenryL commented 1 year ago

Thank you for your interest in our paper.

Although our model produces sparse representations, that does not mean fewer parameters are used for training: the number of learnable parameters is independent of the sparsity. Our method removes parameters by applying masks, which only sets coefficient values to zero rather than removing the coefficients themselves, so the shape (size) of the coefficient tensors is unchanged after masking. Since the number of trainable parameters does not change, training time does not shrink even though our method produces sparse representations. As for the training time itself, it takes about 24 minutes to train our model on the NeRF-synthetic dataset with a Tesla A100.
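To make the point concrete, here is a minimal PyTorch sketch (not the authors' code; the tensor shapes are illustrative) showing that masking only zeroes values and leaves the parameter count intact:

```python
# Masking multiplies coefficients by a binary mask: many entries become
# exactly zero, but the tensor shape, and hence the number of trainable
# parameters, is unchanged.
import torch

coeffs = torch.randn(1, 64, 64, requires_grad=True)  # learnable coefficients
mask = (torch.randn_like(coeffs) > 0).float()        # binary mask (illustrative)

masked = coeffs * mask       # sparse values...
print(masked.shape)          # ...but still shape (1, 64, 64)
print(coeffs.numel())        # parameter count unchanged: 4096
```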

The raw mask is the real-valued mask before binarization. In Figure 7, the raw mask in (a) is binarized as shown in (b) and then used to determine whether each corresponding coefficient is kept or zeroed out.

Regarding the expression "wavelet coefficients with the same levels of sparsity have similar levels of sparsity": wavelet coefficients in the same level have similar levels of sparsity, not the same level of sparsity. Referring to the notation of HL and LH from the previous sentence, "For the 2-level wavelet transform, for instance, we group HL1, LH1, and HH1, then HL2, LH2, and HH2, and finally LL2", although masks for the same level (i.e., HL1, LH1, HH1) appear to have the same sparsity when visualized (Figure 7 (a)), they are not necessarily the same. HL is a high-frequency band along the horizontal axis and a low-frequency band along the vertical axis; similarly, LH is a low-frequency band along the horizontal axis and a high-frequency band along the vertical axis. Since each sub-band serves a different purpose, the mask for each sub-band is not exactly the same, which also means their sparsities are not identical. However, as shown in Figure 7 (a), we empirically found that wavelet coefficients in the same level have similar levels of sparsity. We now see that our expression was unclear, and we will clarify the sentence in the next version of the paper. Thank you for pointing it out.
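For readers unfamiliar with the sub-band layout, the following sketch uses PyWavelets to show how a 2-level transform groups sub-bands by level and how per-level sparsity can be measured. It is an illustration under my own assumptions (PyWavelets, a Haar wavelet, and a 1e-3 near-zero threshold), not the repository's actual pipeline:

```python
# Group the sub-bands of a 2-level wavelet transform by level, mirroring the
# paper's HL1/LH1/HH1, HL2/LH2/HH2, LL2 grouping, and report per-level sparsity.
import numpy as np
import pywt

image = np.random.rand(64, 64)
# wavedec2 returns [LL2, (level-2 details), (level-1 details)];
# each (cH, cV, cD) detail tuple corresponds to the LH/HL/HH sub-bands.
ll2, level2, level1 = pywt.wavedec2(image, 'haar', level=2)

for name, bands in [('level 2', level2), ('level 1', level1)]:
    flat = np.concatenate([b.ravel() for b in bands])
    sparsity = np.mean(np.abs(flat) < 1e-3)  # fraction of near-zero coefficients
    print(f'{name}: {sparsity:.2%} near-zero coefficients')
```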

9B8DY6 commented 1 year ago

Then the raw mask is \hat{W}_r? I cannot understand how you binarize the raw masks. Do you binarize them with a fixed threshold?

daniel03c1 commented 1 year ago

That is correct, \hat{W}_r denotes a raw mask! We used the Heaviside step function (denoted H(.)) to binarize the masks; that is, the threshold was set to zero.
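Since a hard step function has zero gradient almost everywhere, training a raw mask through it typically needs a gradient workaround. Below is a minimal sketch of zero-threshold binarization with a straight-through gradient: the Heaviside step at threshold zero comes from the reply above, while the straight-through part is my assumption about how such a mask stays trainable, not a claim about this repository's code:

```python
# Zero-threshold binarization with a straight-through gradient (assumed).
import torch

def binarize(raw_mask: torch.Tensor) -> torch.Tensor:
    hard = (raw_mask >= 0).float()  # Heaviside step: 1 where raw_mask >= 0
    # Forward pass uses the hard 0/1 mask; backward pass treats the step as
    # the identity, so gradients flow into the raw mask (straight-through).
    return hard + raw_mask - raw_mask.detach()

raw = torch.randn(8, 8, requires_grad=True)  # \hat{W}_r: raw real-valued mask
binary = binarize(raw)                       # 0/1 mask applied to coefficients
binary.sum().backward()                      # gradients reach the raw mask
print(raw.grad.abs().sum() > 0)              # True: the raw mask is trainable
```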