xmed-lab / DIF-Net

MICCAI 2023: Learning Deep Intensity Field for Extremely Sparse-View CBCT Reconstruction
MIT License

About DRR projection generation and the last-layer op #7

Open bjy12 opened 2 months ago

bjy12 commented 2 months ago

Hello, your published work is very exciting. If possible, I would like to ask you the following questions:

  1. Regarding the use of TIGRE for DRR data generation: in the process of creating the DRR data, I saw that your code thresholds the data and then scales it to the range [0, 255] before using TIGRE for DRR imaging. Could you please explain the purpose of setting the threshold? Have you considered converting CT values to attenuation values for the projection?

  2. Regarding the use of F.leaky_relu() for the output of the last layer of the model. Is the use of Leaky ReLU intended to better fit the ground truth? Have you considered using sigmoid, since the range of the ground truth is [0,1]?

lyqun commented 2 months ago

Hi,

1.1.) The data range (i.e., the thresholds, as you said) can be customized. Here, we simply choose a range such that most values (>99%) fall within it. The normalization (scaling and quantizing to uint8 in [0, 255]) is for faster data loading. The data could be saved in uint16 or with more bits, but the loading will be slower, and saving with more storage bits may slightly affect the performance.
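Roughly, the preprocessing amounts to the following sketch (not the exact script; the value range and function name below are just placeholders, and the range should be chosen so that >99% of the voxel values fall inside it):

```python
import numpy as np

def normalize_ct(volume: np.ndarray, v_min: float = -1000.0, v_max: float = 2000.0) -> np.ndarray:
    """Clip to an assumed [v_min, v_max] range, rescale to [0, 255],
    and quantize to uint8 for faster data loading."""
    clipped = np.clip(volume.astype(np.float32), v_min, v_max)
    scaled = (clipped - v_min) / (v_max - v_min) * 255.0
    return scaled.astype(np.uint8)
```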

1.2.) The CT value (on the Hounsfield unit scale) is a linear transformation of the attenuation coefficient. TIGRE calculates the projected value as the summation of the values along the projecting ray. Because of this linear transformation, we have

$$\sum \mu = \sum f(v) = f(\sum v),$$

where $\mu$ is the attenuation coefficient, $v$ is the HU value, and $\mu = f(v)$. Hence, it is fine to run the TIGRE projection directly on the normalized HU scale, since the whole normalization is linear. We have released an updated version of the data preprocessing; please refer to https://github.com/xmed-lab/C2RV-CBCT/tree/main/data.
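As a quick sanity check of this linearity (a toy example, not the TIGRE code): for a purely multiplicative map $f(v) = a\,v$, summing along a ray and applying $f$ commute; an affine offset would only add a term proportional to the ray length, which is still a fixed linear correction.

```python
import numpy as np

a = 0.019                                          # hypothetical scale factor from normalized values to mu
ray_values = np.array([30.0, 120.0, 255.0, 80.0])  # voxel values sampled along one projecting ray

lhs = np.sum(a * ray_values)                       # project the transformed values
rhs = a * np.sum(ray_values)                       # transform the projected value
assert np.isclose(lhs, rhs)                        # identical up to floating-point error
```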

2.) We did not apply an activation function (Leaky ReLU) in the output layer. Please see https://github.com/xmed-lab/DIF-Net/blob/main/code/models/point_classifier.py#L54. As for the sigmoid, we will consider trying it; thanks for the suggestion.
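For clarity, a simplified sketch of such a point head (not the exact code in point_classifier.py; the class name and dimensions are placeholders): the hidden layers use Leaky ReLU, the final Linear layer has no activation, and a Sigmoid could be appended if the targets are normalized to [0, 1].

```python
import torch.nn as nn

class PointHead(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, in_dim: int = 128, hidden_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.LeakyReLU(0.2, inplace=True),  # activations on hidden layers only
            nn.Linear(hidden_dim, 1),         # raw output, no activation here
            # nn.Sigmoid(),                   # optional, if targets lie in [0, 1]
        )

    def forward(self, x):
        return self.mlp(x)
```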

bjy12 commented 2 months ago

Thank you very much for your response. It was my mistake; indeed, you did not use the leaky_relu function in the final layer. Since your predicted values are the normalized HU values for each point in the CT volume, which range from 0 to 255, would you prefer to use a leaky_relu activation to keep the output within [0, 255] and help the model converge better?

lyqun commented 1 month ago

It (leaky_relu) might be better, I guess, but we have not tried it yet.

bjy12 commented 1 month ago

Have you ever tried removing the metal bed from the CT images before generating the DRRs, and comparing the quality of the reconstructed CT? Does it affect the reconstruction quality?

lyqun commented 1 month ago

Not yet, and we are not sure about that.

bjy12 commented 2 days ago

Hi, I have a question regarding the multi-scale feature querying process in your C^2RV framework, specifically about the handling of point coordinates in the MS-3DV (multi-scale 3D volumetric) representations. When querying features for a point across the different scales of the low-resolution volumetric representations, I'm curious about the coordinate system used:

Do you maintain the point coordinates in the original spatial coordinate system across all scales? Or do you scale the point coordinates to match the resolution of each scale in the multi-scale representation?
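To make the two options concrete (just an illustration of what I mean, not your implementation): with `torch.nn.functional.grid_sample`, point coordinates are expressed in a normalized [-1, 1] space, so the same normalized coordinates can query feature volumes at every scale without explicit per-scale rescaling; only points given in world or voxel coordinates would need to be normalized first. The function below is hypothetical.

```python
import torch
import torch.nn.functional as F

def query_multiscale(volumes, points_norm):
    """volumes: list of (1, C, D, H, W) feature volumes at different resolutions.
    points_norm: (1, N, 3) point coordinates already normalized to [-1, 1]."""
    grid = points_norm.view(1, -1, 1, 1, 3)                    # (1, N, 1, 1, 3)
    feats = [
        F.grid_sample(v, grid, align_corners=True)             # (1, C, N, 1, 1)
         .squeeze(-1).squeeze(-1)                              # (1, C, N)
        for v in volumes
    ]
    return torch.cat(feats, dim=1)                             # concatenate features over scales

# The same normalized points sample both a coarse and a fine volume.
vols = [torch.randn(1, 8, 16, 16, 16), torch.randn(1, 8, 64, 64, 64)]
pts = torch.rand(1, 100, 3) * 2 - 1                            # random points in [-1, 1]
out = query_multiscale(vols, pts)                              # (1, 16, 100)
```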