mlmed / torchxrayvision

TorchXRayVision: A library of chest X-ray datasets and models. Classifiers, segmentation, and autoencoders.
https://mlmed.org/torchxrayvision
Apache License 2.0

Model not classifying obvious pneumothorax #81

Open AceMcAwesome77 opened 2 years ago

AceMcAwesome77 commented 2 years ago

Hi, I am trying to run a quick test using this model. I have 2 different chest XR images that show an entire lung collapsed from pneumothorax, and I'd like to verify that the model correctly picks them up. However, it doesn't. My starting image is "img_data", which is a numpy.ndarray of shape (2800, 3408) that looks like this:

[image]

Below is the code I'm running:

import torchxrayvision as xrv
import skimage
import torchvision
import torch

model = xrv.models.ResNet(weights="resnet50-res512-all")

img = xrv.datasets.normalize(img_data, 255)

if len(img.shape) > 2:
    img = img[:, :, 0]
if len(img.shape) < 2:
    print("error, dimension lower than 2 for image")

img = img[None, :, :]

transform = torchvision.transforms.Compose([xrv.datasets.XRayCenterCrop(), xrv.datasets.XRayResizer(512)])

img = transform(img)

output = {}
with torch.no_grad():
    img = torch.from_numpy(img).unsqueeze(0)
    preds = model(img).cpu()
    output["preds"] = dict(zip(xrv.datasets.default_pathologies, preds[0].detach().numpy()))

Running all this code, I get the following "output" variable:

{'preds': {'Atelectasis': 0.031930413, 'Consolidation': 0.0079838885, 'Infiltration': 0.022067936, 'Pneumothorax': 0.012027948, 'Edema': 3.992413e-06, 'Emphysema': 0.008683062, 'Fibrosis': 0.0037461556, 'Effusion': 0.012206978, 'Pneumonia': 0.005400587, 'Pleural_Thickening': 0.043657843, 'Cardiomegaly': 0.0010988085, 'Nodule': 0.011990261, 'Mass': 0.20278542, 'Hernia': 1.3901392e-05, 'Lung Lesion': 0.5, 'Fracture': 0.033246215, 'Lung Opacity': 0.04536338, 'Enlarged Cardiomediastinum': 0.5}}

Here Pneumothorax has a score of 0.012, which should be much higher given the obvious pneumothorax. The other test image behaves the same way: it shows an obvious pneumothorax but scores about 0.01 using this pipeline. What am I doing wrong here? Thanks much!
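
One thing that can be ruled out quickly in a pipeline like this is the intensity scaling. `xrv.datasets.normalize(img, maxval)` is described in the library as scaling images to roughly [-1024, 1024], so the `maxval` argument has to match the real bit depth of the image. The sketch below restates that mapping in plain Python for single pixel values. `to_xrv_range` and `describe_scaling` are hypothetical helpers written for illustration, not library functions, and the pixel value 64 is a made-up example.

```python
def to_xrv_range(value, maxval):
    """Map a pixel value in [0, maxval] onto [-1024, 1024].

    Hypothetical helper that mirrors the scaling xrv.datasets.normalize
    is described as performing; it is not part of the library.
    """
    if value > maxval:
        raise ValueError(f"pixel value {value} exceeds expected bound {maxval}")
    return (2 * (value / maxval) - 1.0) * 1024


def describe_scaling(pixel_min, pixel_max, maxval):
    """Return the (low, high) range the model would see after scaling."""
    return to_xrv_range(pixel_min, maxval), to_xrv_range(pixel_max, maxval)


# An 8-bit image that uses the full 0..255 range fills the whole interval.
full = describe_scaling(0, 255, 255)
print("full-range image:", full)

# A dim image whose pixels only reach 64 (made-up value) stays in the
# lower half of the interval, far from what the model saw in training.
dim = describe_scaling(0, 64, 255)
print("dim image:", dim)
```

In the real pipeline, printing `img_data.dtype`, `img_data.min()`, and `img_data.max()` before the call to `normalize` gives the numbers to plug in here. If the maximum is well below 255, or the array is 16-bit, then `255` is probably not the right `maxval`.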

ieee8023 commented 2 years ago

Hey, sorry for the delay in getting back to you. Your code looks correct. I processed the image you posted using this script: https://github.com/mlmed/torchxrayvision/blob/master/scripts/process_image.py It seems the densenet at 224x224 resolution predicts a higher score, but it could just be relying on some spuriously correlated signal.

Perhaps a pneumothorax that big was rare in the training data so the model didn't learn any features for it. These models are not perfect.

$ python3 process_image.py test-pneumo.png -weights densenet121-res224-all
Warning: Input size (252x252) is not the native resolution (224x224) for this model. A resize will be performed but this could impact performance.
{'preds': {'Atelectasis': 0.2336309,
           'Cardiomegaly': 0.5088244,
           'Consolidation': 0.49505916,
           'Edema': 0.006818563,
           'Effusion': 0.16614664,
           'Emphysema': 0.5041779,
           'Enlarged Cardiomediastinum': 0.44423354,
           'Fibrosis': 0.06360667,
           'Fracture': 0.50422204,
           'Hernia': 0.38043112,
           'Infiltration': 0.19527479,
           'Lung Lesion': 0.04211408,
           'Lung Opacity': 0.16715923,
           'Mass': 0.5078881,
           'Nodule': 0.22657393,
           'Pleural_Thickening': 0.16447839,
           'Pneumonia': 0.12326898,
           'Pneumothorax': 0.50655806}}

$ python3 process_image.py test-pneumo.png -weights resnet50-res512-all
Warning: Input size (252x252) is not the native resolution (512x512) for this model. A resize will be performed but this could impact performance.
{'preds': {'Atelectasis': 0.07039986,
           'Cardiomegaly': 0.006025043,
           'Consolidation': 0.010440136,
           'Edema': 0.00018646289,
           'Effusion': 0.06766936,
           'Emphysema': 0.0020499881,
           'Enlarged Cardiomediastinum': 0.5,
           'Fibrosis': 0.0077288793,
           'Fracture': 0.01869749,
           'Hernia': 0.00023267824,
           'Infiltration': 0.03506834,
           'Lung Lesion': 0.5,
           'Lung Opacity': 0.007614809,
           'Mass': 0.042145472,
           'Nodule': 0.02053261,
           'Pleural_Thickening': 0.009434696,
           'Pneumonia': 0.004180993,
           'Pneumothorax': 0.0058075087}}
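
The two dumps above are easier to read side by side. The sketch below copies a subset of the scores from the two runs and sorts the labels by how much the models disagree. `compare_predictions` and `exactly_half` are hypothetical helpers, not part of torchxrayvision. The idea that a score of exactly 0.5 marks an output with no learned signal for that label is an assumption, not something the thread confirms.

```python
# Scores copied from the two runs above (subset of labels for brevity).
densenet_224 = {
    "Atelectasis": 0.2336309,
    "Cardiomegaly": 0.5088244,
    "Effusion": 0.16614664,
    "Lung Lesion": 0.04211408,
    "Enlarged Cardiomediastinum": 0.44423354,
    "Pneumothorax": 0.50655806,
}
resnet_512 = {
    "Atelectasis": 0.07039986,
    "Cardiomegaly": 0.006025043,
    "Effusion": 0.06766936,
    "Lung Lesion": 0.5,
    "Enlarged Cardiomediastinum": 0.5,
    "Pneumothorax": 0.0058075087,
}


def compare_predictions(a, b):
    """Return (label, a_score, b_score, a - b) rows, largest gap first."""
    shared = sorted(set(a) & set(b))
    rows = [(label, a[label], b[label], a[label] - b[label]) for label in shared]
    return sorted(rows, key=lambda row: abs(row[3]), reverse=True)


def exactly_half(preds):
    """Labels scored exactly 0.5 (assumed here to carry no learned signal)."""
    return sorted(label for label, score in preds.items() if score == 0.5)


rows = compare_predictions(densenet_224, resnet_512)
for label, a, b, gap in rows:
    print(f"{label:28s} {a:.3f} {b:.3f} {gap:+.3f}")

flat = exactly_half(resnet_512)
print("exactly 0.5 in resnet run:", flat)
```

With these numbers, Pneumothorax and Cardiomegaly show the largest gaps, both around 0.5. Several densenet scores cluster just above 0.5, which fits the caution above that the higher score may not reflect a real detection.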

ieee8023 commented 2 years ago

I also took a look at what the 224x224 densenet model was looking at using the gifsplanation approach (https://arxiv.org/abs/2102.09475) using this code: https://colab.research.google.com/github/mlmed/gifsplanation/blob/main/demo.ipynb

The images are not great, but they give an impression of what changes the model prediction, and it seems to be looking at the right thing from what I can see.

[image: test-pneumo1]

https://user-images.githubusercontent.com/446367/148627508-033668fe-1b58-42f5-9f41-8437554ac75a.mp4

AceMcAwesome77 commented 2 years ago

Thanks for the reply! The gifsplanation is super interesting, I'll take a look at that paper.

croraf commented 6 months ago

Explicit resize seems to be problematic (#152); perhaps it also messes up your case.