Closed WGierke closed 6 years ago
confirmed
I did some checking... I may be wrong, but it looks like the specified slice is not valid.
The one we are running the test for is {'x': 367, 'y': 349, 'z': 72}
but the DICOM at this slice looks like a mess:
While when I run the test on {'x': 367, 'y': 349, 'z': 189}
it doesn't fail. Here is how the image looks at specified slice:
@WGierke, @reubano what do you think?
Hi Serhiy, can you check the slice indices, please? The z axis is the number of the slice starting at 0. This is because the image position does not always start at a fixed value. I'm taking this from the design doc. Also, the used Python library for the LIDC images interprets it like that. Thanks!
On Dec 29, 2017, 10:31 PM, at 10:31 PM, Serhiy-Shekhovtsov notifications@github.com wrote:
I did some checking... I may be wrong, but it looks like the specified slice is not valid. The one we are running the test for is
{'x': 367, 'y': 349, 'z': 72}
but the DICOM at this slice looks like a mess:While when I run the test on
{'x': 367, 'y': 349, 'z': 189}
it doesn't fail. Here is how the image looks at specified slice:@WGierke, @reubano what do you think?
-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/concept-to-clinic/concept-to-clinic/issues/268#issuecomment-354502724
I knew I missed something. Will keep checking this issue.
The mentioned issue is very simple - test is running on small DICOM image and 72 is far outside the boundaries of Z dimension(0, 28). The cropping will fail to create a patch of expected size (96, 96, 96). The test can be fixed by simple replacement of dicom_path_003
(small image) with dicom_paths[2]
(full image).
While troubleshooting this issue I have found an other problem - the preprocessing is zooming the image. But coordinates we are using for prediction are not scaled according to the zoom and the actual patch we are making prediction on differs from the expected patch. For example, the zooming factor for full LIDC-IDRI-0003 image is [2.5, 0.820312, 0.820312]. The prediction result for coordinates of the real nodule {'x': 367, 'y': 350, 'z': 72} is 0.0066. As you can see, it's very low. But if you scale these coordinates by the current zooming factor ({'x': 301, 'y': 286, 'z': 180}) we will get much higher probability of concerning 0.42.
I am looking for a good way to fix it, but if you have some ideas - feel free to share :)
Probably, zooming factor from preprocess method should be saved as a property of CT MetaData object and then used by crop_patch method instead of spacing? @WGierke, @reubano
@vessemer , I think you may also be interested in this discussion.
The mentioned issue is very simple - test is running on small DICOM image and 72 is far outside the boundaries of Z dimension(0, 28). The cropping will fail to create a patch of expected size (96, 96, 96). The test can be fixed by simple replacement of dicom_path_003(small image) with dicom_paths[2](full image).
Is there instead, another coordinate that works for both large and small images?
Is there instead, another coordinate that works for both large and small images?
Yes, there is. Don't have it at my hand right now, but will post within an hour. Actually, anything between 0 and 28 should be fine.
Probably, zooming factor from preprocess method should be saved as a property of CT MetaData object and then used by crop_patch method instead of spacing?
Hmm, self.spacing
is used by two different classes. If they are infact referencing the same thing, we should rethink how the code is organized. Maybe having MetaData
also subclass Params
?
But if you scale these coordinates by the current zooming factor ({'x': 301, 'y': 286, 'z': 180}) we will get much higher probability of concerning 0.42.
My suggestion would be to wrap whatever method you used to calculate those new coordinates into a function. Then add that new function into the conditional so that after zooming, it recalculates the coords.
@reubano, @WGierke something like this?
@vessemer the problem we had, the zooming factor got lost after zooming. When params.spacing
is 1 and the meta.spacing
is 2.5 we will zoom the image and then wipe out the meta.spacing
. Then we will don't know how to zoom the coordinates. Also I noticed that there is no use for params.spacing
, so I converted it boolean. When it's true
we will zoom the image during preprocessing according to meta.spacing
but we will also keep that value for future coordinates rescaling.
@Serhiy-Shekhovtsov, Noup, the point is that we don't need to store zooming factor, cause it computed through meta.spacing
and params.spacing
here. Then meta.spacing
updates here. Thus position of a candidate given in mm should be obtained via meta.spacing
. Also, one CT scan may be zoomed several times only carry the final spacing not the sequence of zooming factors.
I've already described where params.spacing
are used, and taking into account that preprocessing differs for different approaches then params.spacing
should be either float or tuple of floats.
In current implementation the preprocessing is overriding the meta.spacing
making it's impossible to scale the coordinates. I am giving you the real example - when params.spacing
is 1 and the meta.spacing
is 2.5 we will zoom the image and then set meta.spacing = 1
. The 2.5 value is lost after that and we cannot scale coordinates.
Also, I can't see any usage for params.spacing
except for this one place, where it is 1, so it doesn't affect the scaling ratio.
If you store coordinates in mm, then you don't need to keep old meta.spacing
values. It is correct, that params.spacing
usage is currently limited to the one case only, nonetheless, I do not see a point of the restriction you proposed for the params.spacing
variable.
Coordinates stored in in nonscaled form. So we have to scale them. And for that we need to know the spacing. So the point of suggested change is - preserving properties of meta
object.
@reubano this issue is solved.
@Serhiy-Shekhovtsov still failing for me on master
$ sudo docker-compose -f local.yml run prediction pytest -vrsk src/tests/test_classification.py
Starting base ...
Starting base ... done
======================================== test session starts =========================================
platform linux -- Python 3.6.3, pytest-3.1.3, py-1.5.2, pluggy-0.4.0 -- /usr/bin/python3.6
cachedir: .cache
rootdir: /app, inifile:
collected 6 items
src/tests/test_classification.py::test_classify_predict_load PASSED
src/tests/test_classification.py::test_classify_dicom PASSED
src/tests/test_classification.py::test_classify_real_nodule_small_dicom PASSED
src/tests/test_classification.py::test_classify_dicom_nodule FAILED
src/tests/test_classification.py::test_classify_real_nodule_full_dicom PASSED
src/tests/test_classification.py::test_classify_luna PASSED
============================================== FAILURES ==============================================
_____________________________________ test_classify_dicom_nodule _____________________________________
dicom_path_003 = '/images/LIDC-IDRI-0003/1.3.6.1.4.1.14519.5.2.1.6279.6001.101370605276577556143013894866/1.3.6.1.4.1.14519.5.2.1.6279.6001.170706757615202213033480003264'
model_path = '/app/src/algorithms/classify/assets/gtr123_model.ckpt'
def test_classify_dicom_nodule(dicom_path_003, model_path):
> predicted = trained_model.predict(dicom_path_003, [{'x': 367, 'y': 349, 'z': 72}], model_path)
src/tests/test_classification.py:21:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/algorithms/classify/trained_model.py:40: in predict
return gtr123_model.predict(dicom_path, centroids, model_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
ct_path = '/images/LIDC-IDRI-0003/1.3.6.1.4.1.14519.5.2.1.6279.6001.101370605276577556143013894866/1.3.6.1.4.1.14519.5.2.1.6279.6001.170706757615202213033480003264'
nodule_list = [{'x': 367, 'y': 349, 'z': 72}]
model_path = '/app/src/algorithms/classify/assets/gtr123_model.ckpt'
def predict(ct_path, nodule_list, model_path=None):
"""
Args:
ct_path (str): path to a MetaImage or DICOM data.
nodule_list: List of nodules
model_path: Path to the torch model (Default value = "src/algorithms/classify/assets/gtr123_model.ckpt")
Returns:
List of nodules, and probabilities
"""
if not model_path:
CLASSIFY_DIR = path.join(Config.ALGOS_DIR, 'classify')
model_path = path.join(CLASSIFY_DIR, 'assets', 'gtr123_model.ckpt')
if not nodule_list:
return []
casenet = CaseNet()
casenet.load_state_dict(torch.load(model_path))
casenet.eval()
if torch.cuda.is_available():
casenet = torch.nn.DataParallel(casenet).cuda()
# else:
# casenet = torch.nn.parallel.DistributedDataParallel(casenet)
preprocess = PreprocessCT(clip_lower=-1200., clip_upper=600., spacing=True, order=1,
min_max_normalize=True, scale=255, dtype='uint8')
# convert the image to voxels(apply the real spacing between pixels)
ct_array, meta = preprocess(*load_ct(ct_path))
patches = patches_from_ct(ct_array, meta, config['crop_size'], nodule_list,
stride=config['stride'], pad_value=config['filling_value'])
results = []
for nodule, (cropped_image, coords) in zip(nodule_list, patches):
> cropped_image = Variable(torch.from_numpy(cropped_image[np.newaxis, np.newaxis]).float())
E RuntimeError: the given numpy array has zero-sized dimensions. Zero-sized dimensions are not supported in PyTorch
src/algorithms/classify/src/gtr123_model.py:273: RuntimeError
================================ 1 failed, 5 passed in 39.14 seconds =================================
@reubano I can't reproduce it on latest version. Can you check it again, please?
@Serhiy-Shekhovtsov did you add the new test_classify_dicom_nodule
test from the included patch file?
@reubano the patch is, actually, invalid:
test is running on small DICOM image and 72 is far outside the boundaries of Z dimension(0, 28). The cropping will fail to create a patch of expected size (96, 96, 96). The test can be fixed by simple replacement of
dicom_path_003
(small image) withdicom_paths[2]
(full image).
So, I have fixed that test by creating two new tests in this commit.
@Serhiy-Shekhovtsov that's right! You did mention that before. I'll edit the original issue so others don't fall back into the same trap. Thanks!
Related to #229 : While the nodule classification works for providing nodule locations with small values in each dimension, it fails when I tried to classify a real nodule (first one of patient LIDC-IDRI-0003).
Expected Behavior
A cancer probability should be returned.
Current Behavior
An error is thrown:
See here to verify that at this position there is a nodule:
Steps to Reproduce
read https://github.com/concept-to-clinic/concept-to-clinic/issues/268#issuecomment-359070245 first
patch.txt
Possible Solution
I guess it has something to do with the input size the gtr123 model expects.
Checklist before submitting
local.yml
configuration and ensured that I built the containers again and they reflect the most recent version of the project at theHEAD
commit on themaster
branch