drivendataorg / concept-to-clinic

ALCF Concept to Clinic Challenge
https://concepttoclinic.drivendata.org/
MIT License
367 stars 147 forks source link

Classification throws RuntimeError for real nodule location #268

Closed WGierke closed 6 years ago

WGierke commented 6 years ago

Related to #229 : While the nodule classification works for providing nodule locations with small values in each dimension, it fails when I tried to classify a real nodule (first one of patient LIDC-IDRI-0003).

Expected Behavior

A cancer probability should be returned.

Current Behavior

An error is thrown:

(venv3.5) ➜  concept-to-clinic git:(master) ✗ docker exec -it concepttoclinic_prediction_run_259  pytest -vrsk src/tests/test_classification.py
=========================================================================================== test session starts ============================================================================================
platform linux -- Python 3.6.1, pytest-3.1.3, py-1.4.34, pluggy-0.4.0 -- /usr/bin/python3.6
cachedir: .cache
rootdir: /app, inifile:
collected 4 items 

src/tests/test_classification.py::test_classify_predict_load PASSED
src/tests/test_classification.py::test_classify_dicom PASSED
src/tests/test_classification.py::test_classify_dicom_nodule FAILED
src/tests/test_classification.py::test_classify_luna PASSED

================================================================================================= FAILURES =================================================================================================
________________________________________________________________________________________ test_classify_dicom_nodule ________________________________________________________________________________________

dicom_path_003 = '/images/LIDC-IDRI-0003/1.3.6.1.4.1.14519.5.2.1.6279.6001.101370605276577556143013894866/1.3.6.1.4.1.14519.5.2.1.6279.6001.170706757615202213033480003264'
model_path = '/app/src/algorithms/classify/assets/gtr123_model.ckpt'

    def test_classify_dicom_nodule(dicom_path_003, model_path):
>       predicted = trained_model.predict(dicom_path_003, [{'x': 367, 'y': 350, 'z': 72}], model_path)

src/tests/test_classification.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/algorithms/classify/trained_model.py:40: in predict
    return gtr123_model.predict(dicom_path, centroids, model_path)
src/algorithms/classify/src/gtr123_model.py:275: in predict
    _, pred, _ = casenet(cropped_image, coords)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py:224: in __call__
    result = self.forward(*input, **kwargs)
src/algorithms/classify/src/gtr123_model.py:215: in forward
    noduleFeat, nodulePred = self.NoduleNet(xlist, coordlist)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py:224: in __call__
    result = self.forward(*input, **kwargs)
src/algorithms/classify/src/gtr123_model.py:175: in forward
    feat = self.back2(torch.cat((rev2, out2, coord), 1))  # 64+64
/usr/local/lib/python3.6/dist-packages/torch/autograd/variable.py:897: in cat
    return Concat.apply(dim, *iterable)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ctx = <torch.autograd.function.ConcatBackward object at 0x7f6c231bbb88>, dim = 1
inputs = (
(0 ,0 ,0 ,.,.) = 
  0.0000  0.0000  0.0000  ...   0.0000  0.0000  0.0000
  0.8768  0.6747  1.5596  ...   0.5966  1.1....4682  0.4782  0.4881
  0.2595  0.2695  0.2794  ...   0.4682  0.4782  0.4881
[torch.FloatTensor of size 1x3x24x24x24]
)

    @staticmethod
    def forward(ctx, dim, *inputs):
        ctx.dim = dim
        ctx.input_sizes = [i.size(dim) for i in inputs]
>       return torch.cat(inputs, dim)
E       RuntimeError: inconsistent tensor sizes at /pytorch/torch/lib/TH/generic/THTensorMath.c:2709

/usr/local/lib/python3.6/dist-packages/torch/autograd/_functions/tensor.py:317: RuntimeError
------------------------------------------------------------------------------------------- Captured stdout call -------------------------------------------------------------------------------------------
torch.Size([1, 1, 52, 96, 96])
=================================================================================== 1 failed, 3 passed in 16.99 seconds ====================================================================================

See here to verify that at this position there is a nodule: screenshot from 2017-12-16 00-40-02

Steps to Reproduce

read https://github.com/concept-to-clinic/concept-to-clinic/issues/268#issuecomment-359070245 first

patch.txt

git checkout 42ba1b121504ad318210463d422f5ca2947cbacb
git apply patch.txt
docker-compose -f local.yml run prediction pytest -vrsk src/tests/test_classification.py

Possible Solution

I guess it has something to do with the input size the gtr123 model expects.

Checklist before submitting

reubano commented 6 years ago

confirmed

Serhiy-Shekhovtsov commented 6 years ago

I did some checking... I may be wrong, but it looks like the specified slice is not valid. The one we are running the test for is {'x': 367, 'y': 349, 'z': 72} but the DICOM at this slice looks like a mess: nodule1

While when I run the test on {'x': 367, 'y': 349, 'z': 189} it doesn't fail. Here is how the image looks at specified slice: nodule1

@WGierke, @reubano what do you think?

WGierke commented 6 years ago

Hi Serhiy, can you check the slice indices, please? The z axis is the number of the slice starting at 0. This is because the image position does not always start at a fixed value. I'm taking this from the design doc. Also, the used Python library for the LIDC images interprets it like that. Thanks!

On Dec 29, 2017, 10:31 PM, at 10:31 PM, Serhiy-Shekhovtsov notifications@github.com wrote:

I did some checking... I may be wrong, but it looks like the specified slice is not valid. The one we are running the test for is {'x': 367, 'y': 349, 'z': 72} but the DICOM at this slice looks like a mess: nodule1

While when I run the test on {'x': 367, 'y': 349, 'z': 189} it doesn't fail. Here is how the image looks at specified slice: nodule1

@WGierke, @reubano what do you think?

-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/concept-to-clinic/concept-to-clinic/issues/268#issuecomment-354502724

Serhiy-Shekhovtsov commented 6 years ago

I knew I missed something. Will keep checking this issue.

Serhiy-Shekhovtsov commented 6 years ago

The mentioned issue is very simple - test is running on small DICOM image and 72 is far outside the boundaries of Z dimension(0, 28). The cropping will fail to create a patch of expected size (96, 96, 96). The test can be fixed by simple replacement of dicom_path_003(small image) with dicom_paths[2](full image).

While troubleshooting this issue I have found an other problem - the preprocessing is zooming the image. But coordinates we are using for prediction are not scaled according to the zoom and the actual patch we are making prediction on differs from the expected patch. For example, the zooming factor for full LIDC-IDRI-0003 image is [2.5, 0.820312, 0.820312]. The prediction result for coordinates of the real nodule {'x': 367, 'y': 350, 'z': 72} is 0.0066. As you can see, it's very low. But if you scale these coordinates by the current zooming factor ({'x': 301, 'y': 286, 'z': 180}) we will get much higher probability of concerning 0.42.

I am looking for a good way to fix it, but if you have some ideas - feel free to share :)

Serhiy-Shekhovtsov commented 6 years ago

Probably, zooming factor from preprocess method should be saved as a property of CT MetaData object and then used by crop_patch method instead of spacing? @WGierke, @reubano

Serhiy-Shekhovtsov commented 6 years ago

@vessemer , I think you may also be interested in this discussion.

reubano commented 6 years ago

The mentioned issue is very simple - test is running on small DICOM image and 72 is far outside the boundaries of Z dimension(0, 28). The cropping will fail to create a patch of expected size (96, 96, 96). The test can be fixed by simple replacement of dicom_path_003(small image) with dicom_paths[2](full image).

Is there instead, another coordinate that works for both large and small images?

Serhiy-Shekhovtsov commented 6 years ago

Is there instead, another coordinate that works for both large and small images?

Yes, there is. Don't have it at my hand right now, but will post within an hour. Actually, anything between 0 and 28 should be fine.

reubano commented 6 years ago

Probably, zooming factor from preprocess method should be saved as a property of CT MetaData object and then used by crop_patch method instead of spacing?

Hmm, self.spacing is used by two different classes. If they are infact referencing the same thing, we should rethink how the code is organized. Maybe having MetaData also subclass Params?

reubano commented 6 years ago

But if you scale these coordinates by the current zooming factor ({'x': 301, 'y': 286, 'z': 180}) we will get much higher probability of concerning 0.42.

My suggestion would be to wrap whatever method you used to calculate those new coordinates into a function. Then add that new function into the conditional so that after zooming, it recalculates the coords.

Serhiy-Shekhovtsov commented 6 years ago

@reubano, @WGierke something like this?

vessemer commented 6 years ago

@Serhiy-Shekhovtsov, thanks for mentioning me, if I correct, then the input coordinates should always be in real units — mm, not in voxels, and for this purpose, we always store current spacing in meta which updates with zoom.

Serhiy-Shekhovtsov commented 6 years ago

@vessemer the problem we had, the zooming factor got lost after zooming. When params.spacing is 1 and the meta.spacing is 2.5 we will zoom the image and then wipe out the meta.spacing. Then we will don't know how to zoom the coordinates. Also I noticed that there is no use for params.spacing, so I converted it boolean. When it's true we will zoom the image during preprocessing according to meta.spacing but we will also keep that value for future coordinates rescaling.

vessemer commented 6 years ago

@Serhiy-Shekhovtsov, Noup, the point is that we don't need to store zooming factor, cause it computed through meta.spacing and params.spacing here. Then meta.spacing updates here. Thus position of a candidate given in mm should be obtained via meta.spacing. Also, one CT scan may be zoomed several times only carry the final spacing not the sequence of zooming factors. I've already described where params.spacing are used, and taking into account that preprocessing differs for different approaches then params.spacing should be either float or tuple of floats.

Serhiy-Shekhovtsov commented 6 years ago

In current implementation the preprocessing is overriding the meta.spacing making it's impossible to scale the coordinates. I am giving you the real example - when params.spacing is 1 and the meta.spacing is 2.5 we will zoom the image and then set meta.spacing = 1. The 2.5 value is lost after that and we cannot scale coordinates. Also, I can't see any usage for params.spacing except for this one place, where it is 1, so it doesn't affect the scaling ratio.

vessemer commented 6 years ago

If you store coordinates in mm, then you don't need to keep old meta.spacing values. It is correct, that params.spacing usage is currently limited to the one case only, nonetheless, I do not see a point of the restriction you proposed for the params.spacing variable.

Serhiy-Shekhovtsov commented 6 years ago

Coordinates stored in in nonscaled form. So we have to scale them. And for that we need to know the spacing. So the point of suggested change is - preserving properties of meta object.

Serhiy-Shekhovtsov commented 6 years ago

@reubano this issue is solved.

reubano commented 6 years ago

@Serhiy-Shekhovtsov still failing for me on master

$ sudo docker-compose -f local.yml run prediction pytest -vrsk src/tests/test_classification.py
Starting base ... 
Starting base ... done
======================================== test session starts =========================================
platform linux -- Python 3.6.3, pytest-3.1.3, py-1.5.2, pluggy-0.4.0 -- /usr/bin/python3.6
cachedir: .cache
rootdir: /app, inifile:
collected 6 items 

src/tests/test_classification.py::test_classify_predict_load PASSED
src/tests/test_classification.py::test_classify_dicom PASSED
src/tests/test_classification.py::test_classify_real_nodule_small_dicom PASSED
src/tests/test_classification.py::test_classify_dicom_nodule FAILED
src/tests/test_classification.py::test_classify_real_nodule_full_dicom PASSED
src/tests/test_classification.py::test_classify_luna PASSED

============================================== FAILURES ==============================================
_____________________________________ test_classify_dicom_nodule _____________________________________

dicom_path_003 = '/images/LIDC-IDRI-0003/1.3.6.1.4.1.14519.5.2.1.6279.6001.101370605276577556143013894866/1.3.6.1.4.1.14519.5.2.1.6279.6001.170706757615202213033480003264'
model_path = '/app/src/algorithms/classify/assets/gtr123_model.ckpt'

    def test_classify_dicom_nodule(dicom_path_003, model_path):
>       predicted = trained_model.predict(dicom_path_003, [{'x': 367, 'y': 349, 'z': 72}], model_path)

src/tests/test_classification.py:21: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/algorithms/classify/trained_model.py:40: in predict
    return gtr123_model.predict(dicom_path, centroids, model_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ct_path = '/images/LIDC-IDRI-0003/1.3.6.1.4.1.14519.5.2.1.6279.6001.101370605276577556143013894866/1.3.6.1.4.1.14519.5.2.1.6279.6001.170706757615202213033480003264'
nodule_list = [{'x': 367, 'y': 349, 'z': 72}]
model_path = '/app/src/algorithms/classify/assets/gtr123_model.ckpt'

    def predict(ct_path, nodule_list, model_path=None):
        """

        Args:
          ct_path (str): path to a MetaImage or DICOM data.
          nodule_list: List of nodules
          model_path: Path to the torch model (Default value = "src/algorithms/classify/assets/gtr123_model.ckpt")

        Returns:
          List of nodules, and probabilities

        """
        if not model_path:
            CLASSIFY_DIR = path.join(Config.ALGOS_DIR, 'classify')
            model_path = path.join(CLASSIFY_DIR, 'assets', 'gtr123_model.ckpt')

        if not nodule_list:
            return []

        casenet = CaseNet()
        casenet.load_state_dict(torch.load(model_path))
        casenet.eval()

        if torch.cuda.is_available():
            casenet = torch.nn.DataParallel(casenet).cuda()
        # else:
        #     casenet = torch.nn.parallel.DistributedDataParallel(casenet)

        preprocess = PreprocessCT(clip_lower=-1200., clip_upper=600., spacing=True, order=1,
                                  min_max_normalize=True, scale=255, dtype='uint8')

        # convert the image to voxels(apply the real spacing between pixels)
        ct_array, meta = preprocess(*load_ct(ct_path))

        patches = patches_from_ct(ct_array, meta, config['crop_size'], nodule_list,
                                  stride=config['stride'], pad_value=config['filling_value'])

        results = []

        for nodule, (cropped_image, coords) in zip(nodule_list, patches):
>           cropped_image = Variable(torch.from_numpy(cropped_image[np.newaxis, np.newaxis]).float())
E           RuntimeError: the given numpy array has zero-sized dimensions. Zero-sized dimensions are not supported in PyTorch

src/algorithms/classify/src/gtr123_model.py:273: RuntimeError
================================ 1 failed, 5 passed in 39.14 seconds =================================
Serhiy-Shekhovtsov commented 6 years ago

@reubano I can't reproduce it on latest version. Can you check it again, please?

reubano commented 6 years ago

@Serhiy-Shekhovtsov did you add the new test_classify_dicom_nodule test from the included patch file?

Serhiy-Shekhovtsov commented 6 years ago

@reubano the patch is, actually, invalid:

test is running on small DICOM image and 72 is far outside the boundaries of Z dimension(0, 28). The cropping will fail to create a patch of expected size (96, 96, 96). The test can be fixed by simple replacement of dicom_path_003(small image) with dicom_paths[2](full image).

So, I have fixed that test by creating two new tests in this commit.

reubano commented 6 years ago

@Serhiy-Shekhovtsov that's right! You did mention that before. I'll edit the original issue so others don't fall back into the same trap. Thanks!