cjacobs1 commented 5 years ago

Extracting all the bounding boxes for the nodules as Findings in XML format similar to Arnaud's nodule classification processor. We will extract:

- [x] the center (world coordinates) and extent (full in mm) of the bounding box per nodule
- [x] the probability score per nodule (probability of it being a nodule)
- [x] a cancer probability score per nodule (at least for the top-5 entries)
- [x] the list will be sorted by confidence score from highest confidence to lowest
- [x] an image-level cancer probability score based on the top-5 entries
- [x] Test if it is possible to generate a cancer probability score per nodule beyond the top-5
- ~~Sufficient information should be available to compute Diameter_mm and Volume_mm3 as well, will we compute and include those?~~ No
- [x] Output more than 5 nodules / image (Feature)
- [x] Add proper tests
  - [x] Test that the top-5 based image-level cancer probability is still consistent with only processing 5 nodules
  - [x] Test that fewer than 5 and 0 nodules also work (add to test cases)
  - [x] Test that the relative to world coordinate conversion works fine. (Sanity check)
  - [x] Test output of generated XML files and compare them with the old result files. (Sanity check)
  - [x] Testing with different transform matrices
  - [x] Reasserting if all tests work now (takes a while)

Proposal output finding XML format:

<LungCADReport>
<LungCAD>...</LungCAD>
<ImageInfo>...</ImageInfo>
<CancerInfo>
  <CaseCancerProbability>0.89</CaseCancerProbability>
  <ReferenceNoduleIDs>1,2,3,4,5</ReferenceNoduleIDs>
</CancerInfo>
<Finding>
  <ID>0</ID>
  <X>-29.6</X>
  <Y>81.62</Y>
  <Z>1800.24</Z>
  <Extent>
    <ExtentX>-29.6</ExtentX>
    <ExtentY>81.62</ExtentY>
    <ExtentZ>1800.24</ExtentZ>
  </Extent>
  <Probability>0.9</Probability>
  <CancerProbability>1.91551e-10</CancerProbability>
  <Diameter_mm>-1.0</Diameter_mm>
  <Volume_mm3>-1.0</Volume_mm3>
</Finding>
</LungCADReport>
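
A minimal sketch of how a report in this layout could be assembled with Python's standard library. The tag names follow the proposal above; the function name `build_report` and the dictionary keys are hypothetical, the LungCAD and ImageInfo groups are left out, and finding IDs are assumed to start at 0 as in the example Finding.

```python
import xml.etree.ElementTree as ET


def build_report(findings, case_cancer_probability, reference_ids):
    """Build a LungCADReport element from a list of finding dicts.

    Tag names follow the proposal; the dict keys are hypothetical.
    """
    report = ET.Element("LungCADReport")

    cancer_info = ET.SubElement(report, "CancerInfo")
    ET.SubElement(cancer_info, "CaseCancerProbability").text = str(case_cancer_probability)
    ET.SubElement(cancer_info, "ReferenceNoduleIDs").text = ",".join(
        str(i) for i in reference_ids
    )

    # Sort by nodule probability, highest first, as agreed in the checklist.
    ordered = sorted(findings, key=lambda f: f["probability"], reverse=True)
    for finding_id, f in enumerate(ordered):
        node = ET.SubElement(report, "Finding")
        ET.SubElement(node, "ID").text = str(finding_id)
        for tag, value in zip(("X", "Y", "Z"), f["center_mm"]):
            ET.SubElement(node, tag).text = str(value)
        extent = ET.SubElement(node, "Extent")
        for tag, value in zip(("ExtentX", "ExtentY", "ExtentZ"), f["extent_mm"]):
            ET.SubElement(extent, tag).text = str(value)
        ET.SubElement(node, "Probability").text = str(f["probability"])
        ET.SubElement(node, "CancerProbability").text = str(f["cancer_probability"])
        # Diameter and volume are not computed; -1.0 marks them as unset.
        ET.SubElement(node, "Diameter_mm").text = "-1.0"
        ET.SubElement(node, "Volume_mm3").text = "-1.0"
    return report
```

Calling `ET.tostring(build_report(...), encoding="unicode")` would give the serialized report as a string.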

silvandeleemput commented 5 years ago

This issue seems closely related to #3 which mentioned the following concern:

We need to find out where the filtering of the nodules happens if there are less than 5 candidate boxes outputted because then, the probabilities and nodule outputs do not match

Originally posted by @cjacobs1 in https://github.com/DIAGNijmegen/bodyct-kaggle-grt123/issues/3#issuecomment-492223009

@cjacobs1 I think I found the location where the filtering (selecting the top5 candidate boxes) happens. This implies that I should be able to output at least all nodule candidates. However, the ranking is based on a confidence value which is not a probability (can be a value > 1 and maybe even < 0). There is of course also the Jaccard/IoU associated with each nodule bounding box, which is probably used for thresholding the nodules into Y/N/ignore categories.

I would like to discuss what to extract besides the center and extent of the bounding box. Are you interested in the Jaccard/IoU values only or also the confidence values?

Furthermore, what output format do we want: an XML file similar to Arnaud's nodule classification processor, or do we want the format already there and extend that with the currently missing information?

silvandeleemput commented 5 years ago

After discussion we agreed on several things. The decisions have been moved to the top of the ticket.

cjacobs1 commented 5 years ago

[ ] An image-level cancer probability score based on the top5 entries. (@cjacobs1 where do we put this in the output XML report? A separate finding or in the ImageInfo group? )

Let's make a separate group for that, called CancerInfo. Also, please change Confidence into Probability. So, it will look like:

<ImageInfo>
  <all image info>
</ImageInfo>
<CancerInfo>
  <CaseCancerProbability>0.89</CaseCancerProbability>
</CancerInfo>
<Findings>
  <Finding>
    <ID>0</ID>
    <X>-29.6</X>
    <Y>81.62</Y>
    <Z>1800.24</Z>
    <Extent>
      <ExtentX>-29.6</ExtentX>
      <ExtentY>81.62</ExtentY>
      <ExtentZ>1800.24</ExtentZ>
    </Extent>
    <Probability>12.4</Probability>
    <CancerProbability>1.91551e-10</CancerProbability>
    <Diameter_mm>-1.0</Diameter_mm>
    <Volume_mm3>-1.0</Volume_mm3>
  </Finding>
</Findings>

[ ] Test if it is possible to generate a cancer probability score per nodule beyond the top-5

[ ] @cjacobs1 Sufficient information should be available to compute Diameter_mm and Volume_mm3 as well, will we compute and include those?

Not needed at this point, the segmentation performance is unclear so not sure how useful it is.

silvandeleemput commented 5 years ago

@cjacobs1 You were right, the nodule probability scores are indeed computed by applying the sigmoid function, so I propose to use those instead of the confidence scores directly.
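
For illustration, a small sketch of that mapping. The function below is the standard logistic sigmoid written in a numerically stable form; it is not copied from the grt123 code, and the name `sigmoid` is only a convenient label.

```python
import math


def sigmoid(confidence):
    """Map a raw confidence score to a value between 0 and 1."""
    if confidence >= 0:
        return 1.0 / (1.0 + math.exp(-confidence))
    # For negative inputs, exp(confidence) cannot overflow.
    z = math.exp(confidence)
    return z / (1.0 + z)
```

A raw confidence of 0 maps to 0.5, large positive confidences approach 1, and negative confidences approach 0, so the ranking of the candidates stays the same.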

silvandeleemput commented 5 years ago

@cjacobs1 I have updated the proposed output format based on your feedback. The CancerInfo group is now included and I have changed Confidence to Probability (which can now actually be a proper probability through the sigmoid). This should be sufficient to get going.

Do we want to reference the relevant nodule findings linked to the CaseCancerProbability?

cjacobs1 commented 5 years ago

@cjacobs1 You were right, the nodule probability scores are indeed computed by applying the sigmoid function, so I propose to use those instead of the confidence scores directly.

Perfect, agreed!

cjacobs1 commented 5 years ago

@cjacobs1 I have updated the proposed output format based on your feedback. The CancerInfo group is now included and I have changed Confidence to Probability (which can now actually be a proper probability through the sigmoid). This should be sufficient to get going.

Do we want to reference the relevant nodule findings linked to the CaseCancerProbability?

Yes, we do. Perhaps best to include the IDs of those nodules in the CancerInfo part. So, it would look like:

<CancerInfo>
  <CaseCancerProbability>0.89</CaseCancerProbability>
  <ReferenceNoduleIDs>1,2,3,4,5</ReferenceNoduleIDs>
</CancerInfo>

But if we sort them, it will always be 1,2,3,4,5, right? But still good to add, I think. For someone who is not so familiar with the algorithm.

silvandeleemput commented 5 years ago

But if we sort them, it will always be 1,2,3,4,5, right? But still good to add, I think. For someone who is not so familiar with the algorithm.

My thought exactly. I'll add it to the decision.

silvandeleemput commented 5 years ago

I am making good progress on this. There are just a few things left to do before this can be put into a PR:

Moved TODO to the top of the ticket.

silvandeleemput commented 5 years ago

General update

A full XML report will be generated per input image to the specified output directory. This report contains:
- an image-level cancer score, with references to the nodules the score is based on
- a finding per identified nodule (this can now exceed the 5 nodules limit)
  - probability of being a nodule
  - probability of being a cancer
  - nodule center in world coordinates (mm)
  - nodule bbox extent x/y/z (full size in mm, i.e. the length from a to b in a<--c-->b, where c is the center)
All other output reports (.csv, .json and such) have been made optional and are disabled by default.
Preprocessing and nodule bbox intermediate files have been refactored out of the source files and go into outputdir/prep and outputdir/bbox by default.
Cases with 0 nodules used to crash the processor. This has been fixed: such a case now outputs 0 findings and a cancer probability score of 0 by default, based on the formula in the grt123 arXiv paper.
Cases with n nodules and n < 5 will get a score based on the available n nodules. The reference IDs are adjusted accordingly, and never more than 5 nodules are used.
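
A rough sketch of the behaviour described above for 0, fewer than 5, and more than 5 nodules. It assumes a noisy-or style combination of the per-nodule cancer probabilities; the formula actually used is the one in the grt123 paper and is not reproduced in this thread, so treat the combination rule, the function name, and the `dummy_prob` parameter as assumptions.

```python
def case_cancer_probability(findings, top_k=5, dummy_prob=0.0):
    """Combine per-nodule cancer probabilities into one case-level score.

    findings: list of (nodule_id, nodule_probability, cancer_probability).
    Returns (case_probability, reference_ids). Assumes a noisy-or rule.
    """
    # Keep at most top_k nodules, ranked by nodule probability.
    top = sorted(findings, key=lambda f: f[1], reverse=True)[:top_k]

    # Probability that none of the selected nodules is malignant.
    not_cancer = 1.0 - dummy_prob
    for _, _, cancer_prob in top:
        not_cancer *= 1.0 - cancer_prob

    reference_ids = [nodule_id for nodule_id, _, _ in top]
    return 1.0 - not_cancer, reference_ids
```

With an empty list this returns a score of 0.0 and no reference IDs, and with more than 5 findings only the 5 highest-ranked ones contribute, so the score matches a run that processed only those 5.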

Unresolved/new issues:

Reporting

SeriesUID can be any string, not only a valid series UID. Currently it is set to the filename of the processed file without the .mhd or .mha extension. Throw a warning or ignore this issue?
Extent: full-extent vs half-extent in mm (now: full in mm)
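
If the warning route is chosen for the SeriesUID question, one possible sketch is shown below. It checks the general DICOM UID shape (dot-separated numeric components without leading zeros, at most 64 characters) and warns when the filename stem does not fit. Both function names are hypothetical and not part of the processor.

```python
import re
import warnings

# Dot-separated numeric components, no leading zeros except a lone "0".
_UID_PATTERN = re.compile(r"(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*")


def is_valid_dicom_uid(value):
    """Return True if value has the shape of a DICOM UID."""
    return len(value) <= 64 and _UID_PATTERN.fullmatch(value) is not None


def series_uid_from_filename(filename):
    """Strip a .mhd/.mha extension and warn if the stem is not a UID."""
    stem = filename
    for ext in (".mhd", ".mha"):
        if stem.lower().endswith(ext):
            stem = stem[: -len(ext)]
            break
    if not is_valid_dicom_uid(stem):
        warnings.warn(f"SeriesUID {stem!r} is not a valid DICOM UID")
    return stem
```

The stem is returned unchanged either way, so existing reports keep their current SeriesUID value and only gain a warning.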

Performance issues:

GPU processing seems to be enabled even if the number of GPUs is set to zero.
On whole-case processing, cases with more than roughly 30 nodules cause a GPU OOM (tested with 6 GB VRAM) during cancer classification. These could potentially be processed in smaller batches to avoid OOM errors.
First performing detection and subsequently classification on many files sometimes results in OOM, while then running only the classification performs fine. This is probably because of unfreed resources and could potentially be fixed.
Preprocessing can be pretty fast in most of the cases, but it also takes 10-20 minutes/case in some instances.
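
The batching idea mentioned above could look roughly like this. It is a framework-independent sketch: `classify_batch` stands in for whatever function runs the cancer classifier on a list of nodule crops, and both that callable and the default batch size are assumptions.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def classify_in_batches(crops, classify_batch, batch_size=8):
    """Run classify_batch on small chunks and concatenate the results.

    classify_batch is a placeholder for the real classifier call; it must
    return one result per input crop, in order.
    """
    results = []
    for batch in iter_batches(crops, batch_size):
        results.extend(classify_batch(batch))
    return results
```

Peak memory then depends on `batch_size` rather than on the number of nodules in the case, at the cost of more classifier calls.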

Untested

Whether the conversion from world to voxel and from voxel to world coordinates works correctly.

silvandeleemput commented 5 years ago

Today, I was finally able to identify and resolve the issue with the conversion from voxel to world coordinates. The test MHD and MHA files had incorrect (mirrored) spacing and offset with respect to the DICOM files, and the ITK file loader also loaded the spacing and offset mirrored.

As a result, the tests passed on the test files but failed on new DICOM files. I have fixed the ITK loader and the MHD and MHA test files. In addition, I have added rigorous testing for the voxel to world coordinate conversions.
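
To make the failure mode concrete, here is a small sketch of the usual ITK-style mapping, world = origin + direction · (spacing × voxel), in plain Python. It is not the project's WorldToVoxelConvert class, the function name is hypothetical, and reading "mirrored" as reversed axis order (z, y, x instead of x, y, z) is an assumption.

```python
def voxel_to_world(voxel, origin, spacing, direction):
    """Convert a voxel index to world coordinates in mm.

    voxel, origin, spacing: (x, y, z) tuples.
    direction: 3x3 direction cosine matrix as a list of rows.
    """
    # Scale each index by the voxel spacing along its own axis.
    scaled = [v * s for v, s in zip(voxel, spacing)]
    # Rotate/flip with the direction matrix, then shift by the origin.
    return tuple(
        o + sum(d * c for d, c in zip(row, scaled))
        for o, row in zip(origin, direction)
    )
```

With anisotropic spacing such as (0.5, 0.5, 2.0), passing the spacing or origin in reversed order changes the result, which is the kind of error that goes unnoticed when the test files and the loader share the same reversed convention.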

Some things remaining before the PR:

[x] Testing with different transform matrices
[x] Reasserting if all tests work now (takes a while)

haimasree commented 5 years ago

Does this mean it was not an issue with convert_voxel_to_world_coordinates.py, but the input test files? If indeed the above code is faulty, can I see line numbers which may be contributing to the failure? @silvandeleemput

silvandeleemput commented 5 years ago

Does this mean it was not an issue with convert_voxel_to_world_coordinates.py, but the input test files?

@haimasree Yes, there was no fault in your WorldToVoxelConvert class. So far your module holds up fine against my tests, but I still need to test it with transform matrices. The fault was indeed in the input test files, which probably also led to an incorrect implementation of the ITK image file loader. My apologies for the unwarranted callout earlier.

haimasree commented 5 years ago

Oh, no worries at all. Like I said before, I never tested with mhd/mha files, and I fixed that fairly quickly, which didn't allow me to do as much testing as I would have liked. Hence, I was curious. Good work!

DIAGNijmegen / bodyct-dsb2017-grt123

Output all nodule candidates with their respective nodulePred to separate file #7
