NarcissusEx / CuNeRF

[ICCV2023] CuNeRF: Cube-Based Neural Radiance Field for Zero-Shot Medical Image Arbitrary-Scale Super Resolution
https://narcissusex.github.io/CuNeRF/
MIT License

The role of the parameter 'scale' in the training mode and the 'scales' in the testing mode #7

Open 11710615 opened 4 months ago

11710615 commented 4 months ago

I still have two questions about the training.

1. What is the input to training: the fully-sampled HR volume or the LR volume? The question stems from the fact that the code downsamples the h and w directions according to the parameter 'scale' in `dataset.py`:

```python
xy_inds = torch.meshgrid(torch.linspace(0, self.H - 1, self.H // self.scale),
                         torch.linspace(0, self.W - 1, self.W // self.scale))
```

2. What is the role of the parameter 'scale' in training mode and of 'scales' in testing mode? If the input is the LR volume, the model converges on the existing sparse sampling points, so it seems that 'scale' is not used in the training process.
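For reference, a minimal standalone sketch of what that linspace/meshgrid pattern produces, using toy sizes I chose myself (H = W = 8, scale = 2) rather than the repo's configuration:

```python
import torch

H, W, scale = 8, 8, 2  # toy sizes for illustration, not from the repo config

# Same pattern as dataset.py: H // scale points spread evenly over [0, H - 1]
ys = torch.linspace(0, H - 1, H // scale)
xs = torch.linspace(0, W - 1, W // scale)
y_inds, x_inds = torch.meshgrid(ys, xs, indexing="ij")

print(tuple(y_inds.shape))  # (4, 4): an in-plane grid subsampled by `scale`
print(ys[0].item(), ys[-1].item())  # 0.0 7.0: endpoints of the full axis kept
```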

Thanks for your fast reply again.

NarcissusEx commented 4 months ago

1) In the training stage, our input is an LR volume obtained by downsampling the HR one by 'scale'. After training, we evaluate the SR performance. 2) In the test stage, we assume the checkpoint is a model trained with scale=1, and then render the upsampled views corresponding to 'scales'.

So the scale in train/eval denotes the downsampling scale, while the scale in test denotes the upsampling one. I apologize for causing any misunderstanding.
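To make the two meanings concrete, here is a toy calculation (the numbers are illustrative, not from our configs):

```python
HR_SIDE = 64  # assumed HR volume side length, for illustration only

# Training/eval: 'scale' is a downsampling factor, so the model
# is fitted on the resulting LR volume.
train_scale = 2
lr_side = HR_SIDE // train_scale
print(lr_side)  # 32

# Test: 'scales' are upsampling factors applied to a scale=1 checkpoint.
test_scales = [2, 4]
sr_sides = [HR_SIDE * s for s in test_scales]
print(sr_sides)  # [128, 256]
```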

11710615 commented 4 months ago

I understand that the framework reads an HR volume and downsamples it to the LR one by the parameter 'scale'. But I wonder where in the code the downsampling happens. Is it the code below?

```python
xy_inds = torch.meshgrid(torch.linspace(0, self.H - 1, self.H // self.scale),
                         torch.linspace(0, self.W - 1, self.W // self.scale))
```

It seems that this code downsamples the HR volume in the x and y directions without changing the z direction, which is the opposite of the 'volumetric MISR' setting in the paper. Thank you for your patience in replying to my questions.

NarcissusEx commented 4 months ago

We decouple 3D downsampling into xy plane downsampling and z downsampling. Please see https://github.com/NarcissusEx/CuNeRF/blob/9da7b0cdaf5a223ae4dbee205f3c4c7da1f2b6f6/src/dataset.py#L68.

11710615 commented 4 months ago

```python
z_ind = index // self.scale * self.scale
```

It seems this code only ensures the index is an integral multiple of self.scale; it does not downsample the z direction. Meanwhile, the index is in range(self.LEN), which equals self.data.shape[0], so a possible explanation is that the framework reads the LR volume directly, i.e. one already downsampled along the z direction.
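To illustrate my reading, a standalone sketch with an assumed scale of 4 and an assumed z extent of 21 (i.e. a stand-in for self.data.shape[0]):

```python
scale = 4   # assumed value, for illustration
LEN = 21    # assumed z extent, standing in for self.data.shape[0]

# Snapping every index down to a multiple of `scale`...
snapped = [i // scale * scale for i in range(LEN)]

# ...restricts reads to every scale-th slice position:
print(sorted(set(snapped)))  # [0, 4, 8, 12, 16, 20]
```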

NarcissusEx commented 4 months ago

Huh, maybe you can try the following code:

```python
scale = 4
img_inds = list(range(21))
print([i // scale * scale for i in img_inds])
```

which can only select indices belonging to the LR volume.

11710615 commented 4 months ago

I misunderstood the code. Thank you very much for your patient reply.

byungjur96 commented 1 month ago

Hi, I have a similar question about the code. As far as I understand, in the case of 3D SR the HR volume is read for metric calculation, and 'scale' downsamples it into the LR volume, which is the actual input to the model to be super-resolved. In the case of CT volumetric super-resolution, which only super-resolves along the z-axis, how should we set the input and the scale value?
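For context, here is a hypothetical sketch (my own, not the repository's actual API) of what per-axis scales for z-only downsampling might look like, with toy extents:

```python
# Hypothetical per-axis scales: keep the in-plane scale at 1
# and snap only the z index. All names and values are assumptions.
scale_xy, scale_z = 1, 4   # assumed values
H, W, D = 8, 8, 21         # toy HR volume extents

# The in-plane grid stays at full resolution when scale_xy == 1.
h_pts, w_pts = H // scale_xy, W // scale_xy

# Only every scale_z-th slice is ever read along z.
z_inds = sorted({i // scale_z * scale_z for i in range(D)})

print(h_pts, w_pts)  # 8 8
print(z_inds)        # [0, 4, 8, 12, 16, 20]
```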