arcadelab / deepdrr

Code for "DeepDRR: A Catalyst for Machine Learning in Fluoroscopy-guided Procedures". https://arxiv.org/abs/1803.08606
GNU General Public License v3.0

question about the segmentation of bone vs soft tissue #10

Closed pc4653 closed 4 years ago

pc4653 commented 4 years ago

Hi Prof. Unberath,

Thanks for the awesome repo. I have a question about separating the bone and soft tissue structures. In the image shown in the Readme (as well as in my testing on the LIDC dataset), it seems like the bone segmentation map does not cover the entirety of, e.g., the vertebrae of the spine, but rather just their outer ring. As a result, the spine is very visible in the soft tissue render. I am wondering if this is by design or due to imperfections in the segmentation?

I attached a bone segmentation image below to better illustrate:

(attached image: seg0_bone)

Many thanks, Cheng

mathiasunberath commented 4 years ago

Thank you for using DeepDRR. The segmentation is usually the most critical part, and unfortunately, it is not easy. For now, I will only focus on threshold-based segmentation, because CNN-based segmentation raises additional issues (such as different CT scanners, reconstruction protocols, ... that may affect generalization of the CNN).

When using threshold-based segmentation, setting the HU threshold for bone is challenging because the magnitude depends on the bone density of the individual. Additionally, due to the finite resolution of CT scans, trabecular bone will exhibit partial volume effects which will result in a lower HU value (average of bone and other materials that occupy the same voxel).

Really what one would want is either much higher resolution CT scans such that this can be resolved, or another segmentation label for trabecular bone (which cannot be done with thresholds as it would be ambiguous). We found that selecting HU thresholds such that the cortical bone is well segmented but the trabecular bone is not or only partially segmented resulted in the most visually appealing results.
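For reference, a minimal sketch of the kind of threshold-based segmentation described above (the HU cut-offs and function name are illustrative placeholders, not the values or API used by DeepDRR):

```python
import numpy as np

def threshold_segmentation(ct_hu, air_max=-800, bone_min=350):
    """Split a CT volume (in HU) into air / soft tissue / bone masks.

    The HU cut-offs are illustrative only; as discussed above, the bone
    threshold is chosen so that cortical bone is segmented well, even if
    trabecular bone is only partially captured.
    """
    air = ct_hu < air_max
    bone = ct_hu > bone_min
    soft_tissue = ~air & ~bone
    return {"air": air, "soft tissue": soft_tissue, "bone": bone}
```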

I will close this for now but you can re-open if you have further questions.

pc4653 commented 4 years ago

Hi Prof. Unberath,

Another question I have while attempting to understand DeepDRR: does the projector function necessarily need the information from the segmentation maps?

According to the CUDA code, it seems that the density and segmentation volumes are multiplied during the calculation of each ray path (while being interpolated individually). Wouldn't it be quicker to multiply the density volume by the segmentation beforehand and give the projector a single volume to deal with?

Thanks! Cheng

mathiasunberath commented 4 years ago

The simulation of image formation does not stop after projection. Multiplication with segmentation masks allows us to compute material-dependent density along any given ray. These are then weighted in the projector with the material- and energy-dependent attenuation values and the energy density of the selected X-ray source spectrum. You could not do this if you only computed one projection as you suggest.
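To make this concrete, a hedged sketch of why the per-material line integrals have to stay separate until the spectrum is applied (names and signatures here are placeholders, not the actual DeepDRR API):

```python
import numpy as np

def polychromatic_intensity(material_line_integrals, mass_attenuation, energies, spectrum):
    """Combine per-material density line integrals into a polychromatic image.

    material_line_integrals: dict material -> 2D array of density line
                             integrals along each ray (g/cm^2)
    mass_attenuation:        dict material -> 1D array of mass attenuation
                             coefficients (cm^2/g), one entry per energy bin
    energies, spectrum:      energy bins (keV) and relative photon counts of
                             the X-ray source spectrum
    """
    image = np.zeros_like(next(iter(material_line_integrals.values())), dtype=float)
    for i, (energy, photons) in enumerate(zip(energies, spectrum)):
        # The attenuation of this energy bin depends on how much of each
        # material every ray crossed, so the material contributions cannot be
        # collapsed into a single projection beforehand.
        exponent = sum(mass_attenuation[m][i] * material_line_integrals[m]
                       for m in material_line_integrals)
        image += photons * energy * np.exp(-exponent)
    return image
```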

pc4653 commented 4 years ago

I think perhaps I did not explain myself clearly; I did not mean computing only one projection. What I meant is that we can multiply the [air, soft tissue, bone] masks with the density volume to obtain three volumes, say [air_vol, soft_tissue_vol, bone_vol]. Then we compute the forward projection of each volume to get the material-dependent density along the given ray. This way we do not need to interpolate both the volume and the masks three times during each calculation, just the three masked-out volumes.
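A minimal sketch of the pre-multiplication described above (purely illustrative, not DeepDRR code):

```python
def premultiply_materials(density, masks):
    """Multiply the density volume (a numpy array) with each material mask
    ahead of time, yielding one volume per material that can then be
    forward projected on its own."""
    return {material: density * mask for material, mask in masks.items()}
```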

There is more to do after the projections, but from what I understand, downstream functions like mass_attenuation.calculate_intensity_from_spectrum(), which compute the material- and energy-dependent attenuation, are separate from the actual forward projection operations. Is there something that necessitates using the full density volume during each forward projection operation?

Thanks! Cheng

mathiasunberath commented 4 years ago

I see. So, I guess the reasoning behind this is that if you do this consecutively, you will have to transfer three volumes and their outputs back and forth to the GPU. Usually, this transfer is what gives you the most overhead and drop in performance. If you now generate, say, 10k projections and for every one of them you need to upload and unload volumes, you will accumulate quite some overhead. The way it's currently implemented (IIRC), the volume and masks can remain on the GPU while the pose parameters are queried, so it seems more efficient. This being said, we have not benchmarked it.
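The trade-off being described, sketched with CuPy as a stand-in (the names and the `project` kernel are hypothetical; this is not DeepDRR's implementation):

```python
import cupy as cp

def render_many(density, masks, poses, project):
    """Keep the volume and masks resident on the GPU across many projections.

    `project(density_gpu, masks_gpu, pose)` stands in for the forward-projection
    kernel; only the (small) per-image results travel back to the host, so the
    large volumes cross the PCIe bus exactly once.
    """
    density_gpu = cp.asarray(density)                        # host -> device, once
    masks_gpu = {m: cp.asarray(v) for m, v in masks.items()}
    return [cp.asnumpy(project(density_gpu, masks_gpu, pose)) for pose in poses]
```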

pc4653 commented 4 years ago

Hi Prof. Unberath,

Sorry to bother you again, but I have a conceptual question about the relationship between CT and X-ray imaging and would like to ask you about it for my own understanding, if you have time:

In the DeepDRR paper you stated that a regular DRR is generated based on some unrealistic assumptions ("only considers a single material in the mono-energetic case"). I am attempting to understand why the HU values in CT are generated with those assumptions when both regular X-ray and CT imaging use (presumably) similar X-ray emitters - in other words, the sinogram collected by the CT detector should reflect X-ray attenuation in the same way a regular 2D X-ray image does.

I have not been able to find anything that clearly explains this issue, but my best guess is that the mismatch between HU values in CT and attenuation values in 2D X-ray is due to the backprojection algorithm, which makes the assumption of a single material in the mono-energetic case. As a result, when we perform forward projection, that assumption carries over. Following this line of thinking, if we had the raw sinogram data from the CT scan, we could actually construct a much better 2D X-ray estimate (up to some interpolation error) from it.

Is this the correct interpretation?

Thank you so much for your help! Cheng

mathiasunberath commented 4 years ago

Hi Cheng.

Indeed, real measurements will not suffer from these problems. X-ray sources are (other than highly specialized devices) polychromatic, and the imaged objects consist of multiple materials. However, the argument we make in the paper is NOT about digital radiography (i.e. X-ray imaging) in reality; it is about the simulation thereof: digitally reconstructed radiographs. Usually, these are generated by simple integration over HU/density values, so clearly there is no notion of polychromaticity nor material-dependent attenuation. These are things that need to be explicitly accounted for, and DeepDRR tries to do so directly from routine CT scans.
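For contrast, a conventional DRR in the sense described above boils down to something like this (a minimal sketch; `line_integral` stands in for any ray-casting forward projector and is not a DeepDRR function):

```python
import numpy as np

def conventional_drr(ct_hu, rays, line_integral):
    """Mono-energetic, single-material DRR: integrate CT values along each ray.

    `line_integral(volume, rays)` is a placeholder returning the path integral
    of the volume along each ray. There is no X-ray spectrum and no
    material-dependent attenuation, which is exactly the simplification
    discussed above.
    """
    # crude HU -> attenuation surrogate: ~0 for air, ~1 for water
    mu = (ct_hu + 1000.0) / 1000.0
    return np.exp(-line_integral(mu, rays))  # Beer-Lambert with one effective mu
```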

Certainly, if you had a sinogram of the CT scan, you could "rebin" it in certain ways, which would give you real measurements in a different geometry. However, you will likely find that you will not have measured the rays you would need for interpolation-free rebinning, and you will also find that interpolation is complicated (impossible, even) due to parallax. Finally, the whole point of realistic simulation is the prospect of simulating surgery and procedures in silico, specifically for novel approaches where real data is hard (or impossible) to acquire and collect.

ackbar03 commented 3 years ago

Hi Prof.

Sorry for commenting on an old issue, but I am just hoping to confirm something. My interpretation is that both the less dense trabecular bone and the denser cortical bone should be identified as "bone" material, despite the trabecular bone having a density in the range of soft tissue, because this affects how the X-ray is projected during simulation. Is this interpretation correct?

Thanks, Michael

mathiasunberath commented 3 years ago

Theoretically, yes. Practically, it depends. Because of partial volume effects, you'd likely overestimate the presence of bone in the trabeculae, which would result in very strong attenuation. In addition, the trabecular bone intensity can be very close to that of muscle and other tissue, and you would not want to include those in the bone segmentation. So you may be dealing with a compromise.