Dataset availability - Githubissues

FabianIsensee commented 3 years ago

Hi there, thank you so much for providing this tool! It is very easy and convenient to use. I experimented with your model for the covid challenge (https://covid-segmentation.grand-challenge.org/COVID-19-20/). I used the R231CovidWeb model. Unfortunately it was producing some errors, especially in regions where covid-induced alterations were close to the border of the lungs. It appears to me that you are using a 2D U-Net to segment the lungs. This may be a suboptimal design choice for 3D medical image data. Is the dataset you trained your model with publicly available? I would like to use it to train a 3D U-Net (with nnU-Net https://github.com/MIC-DKFZ/nnUNet) and see whether that improves the results on the CovidSeg challenge dataset. Best, Fabian

JoHof commented 3 years ago

Hi Fabian,

Thanks for your feedback. You are right, 3D may bring some improvement, and it was my plan to also train an nnU-Net, which I think is superawesome! However, there are also some drawbacks, as you become less flexible with respect to slice distance and field of view. In the end, the 2D model worked so well that I never bothered with 3D. My goal was to make a tool that is as general as possible, and for that a simple 2D model seemed the better fit. For lobe segmentation, on the other hand, I think 3D is key.

It was the plan to release the dataset and I hope it still is. I have already left academia and no longer have access to the data myself. Someone has to get it through data clearing. I will follow up here when there is news.

My experience, and that of others, is that the model is quite robust on covid cases. Could you point us to some particular cases in the covid challenge where the model failed?

FabianIsensee commented 3 years ago

Hi Johannes, thank you for your quick response! My original plan for the challenge was to use your lung segmentation as a mask for the CT images. That way my model would only have to concentrate on the lung and would not have to deal with all the other stuff in the image. The problem is that in many cases doing that would have cut off some of the covid ground truth segmentation, and that is certainly something we do not want to do. To be fair, a lot of this cutting-off is due to the ground truth being rather inaccurate and sometimes extending beyond the lung. But there are also cases in which it looks like the segmentation of your model is incorrect. I have attached such a case here: volume-covid19-A-0003

Your lung segmentation is pink and yellow. The covid ground truth is red. Here is a coronal view of the same case:

It seems like there is some large lesion (or whatever) in the lung that is apparently part of the covid class but was not segmented as lung by your algorithm. I am not knowledgeable enough to say what exactly that is.

Note that this is the worst case I could find. Other cases are not as severe: volume-covid19-A-0129:

volume-covid19-A-0570 (I omitted the covid label here because it was making things hard to see)

I think in those cases a 3D U-Net could give more accurate results.

You mentioned that you went with your own implementation due to ease of use. nnU-Net now also offers easy sharing of weights and can be used with minimal effort to create projects such as this one here. See for example https://github.com/NeuroAI-HD/HD-GLIO

Best, Fabian

JoHof commented 3 years ago

I see. Unfortunately, there will always be cases for which it yields suboptimal results. 3D would probably help in these cases. It also seems it suffers a bit from body kernels, which were underrepresented in the training set. You could add the false negatives that you know from the lesion ground truth to the lung labels and train an nnU-Net on that data. That will probably work well on the validation set. Alternatively, you can check out the v0.3 branch and apply the R231lung1 model. This will be more robust towards very dense pathologies but will likely give you more false positives.
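The suggestion above, adding lesion voxels that the lung mask missed back into the lung label, could be sketched roughly as follows. This is a minimal illustration with NumPy, assuming both masks are arrays of the same shape on the same voxel grid; the function name `merge_lesion_into_lung` is hypothetical and not part of lungmask or nnU-Net.

```python
import numpy as np


def merge_lesion_into_lung(lung_mask, lesion_mask):
    """Return (binary lung label including lesion voxels, number of voxels added).

    Hypothetical helper: any nonzero voxel counts as foreground in either mask.
    """
    lung = np.asarray(lung_mask) > 0
    lesion = np.asarray(lesion_mask) > 0
    if lung.shape != lesion.shape:
        raise ValueError("masks must have the same shape")
    # Lesion voxels that the lung segmentation did not cover (false negatives).
    missed = lesion & ~lung
    # Union of both masks as a binary training label.
    merged = (lung | lesion).astype(np.uint8)
    return merged, int(missed.sum())
```

The returned count gives a quick per-case measure of how much lesion ground truth would have been cut off by masking with the lung segmentation alone. Note that the merged label inherits any inaccuracies of the lesion ground truth, such as annotations extending beyond the lung.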

FabianIsensee commented 3 years ago

Hi, using the covid segmentations as an additional lung label might work - but the result is still going to be imperfect. So that will probably not give much additional value relative to the segmentations I can already obtain from your method. I think I will just stick with lungmask as it is and use the lung segmentation to get a coarse bounding box. Is it possible that your code does some postprocessing on the segmentation masks? There are images in the challenge where one image is stacked on top of itself, causing two thoraxes to be visible in the image. In these cases, your model will always produce only one segmentation (the second lung is not segmented). That sounds like all-but-largest-connected-component suppression. Is there an easy way to disable that? Best, Fabian

JoHof commented 3 years ago

Yes, it keeps only the largest components and does some hole filling. You can deactivate all of the postprocessing with the --nopostprocess flag.
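To illustrate why this kind of postprocessing drops the second thorax in a stacked image, here is a rough sketch of largest-component selection followed by hole filling, using `scipy.ndimage`. It is not lungmask's actual implementation; the function names and the `n_keep` parameter are assumptions made for this example.

```python
import numpy as np
from scipy import ndimage


def keep_largest_components(mask, n_keep=1):
    """Keep only the n_keep largest connected foreground components."""
    labeled, n_components = ndimage.label(np.asarray(mask) > 0)
    if n_components == 0:
        return np.zeros(labeled.shape, dtype=bool)
    # Voxel count per component; index 0 is background, so skip it.
    sizes = np.bincount(labeled.ravel())[1:]
    # Component labels start at 1, hence the +1 after sorting by size.
    keep = np.argsort(sizes)[::-1][:n_keep] + 1
    return np.isin(labeled, keep)


def postprocess(mask, n_keep=1):
    """Largest-component selection followed by hole filling."""
    return ndimage.binary_fill_holes(keep_largest_components(mask, n_keep))
```

With `n_keep=1`, a second, smaller structure that is not connected to the first is removed entirely, which matches the behaviour described for the doubled CT images.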

FabianIsensee commented 3 years ago

thanks!

FabianIsensee commented 3 years ago

They updated the dataset and fixed those 'double' CT images. I should be able to use your tool out of the box to get a bbox for the lung. Thank you so much for your help and your explanations. Very much appreciated! Best, Fabian
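A coarse bounding box of the kind mentioned above can be derived from a lung mask in a few lines. The following is a minimal NumPy sketch, assuming the mask is an array with nonzero lung voxels; `lung_bounding_box` and its `margin` parameter are hypothetical names chosen for illustration.

```python
import numpy as np


def lung_bounding_box(mask, margin=0):
    """Return a tuple of slices enclosing all nonzero voxels, or None if empty.

    margin expands the box by that many voxels per side, clipped to the image.
    """
    mask = np.asarray(mask)
    coords = np.argwhere(mask > 0)
    if coords.size == 0:
        return None
    lower = np.maximum(coords.min(axis=0) - margin, 0)
    # +1 because slice ends are exclusive.
    upper = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    return tuple(slice(int(lo), int(hi)) for lo, hi in zip(lower, upper))
```

The returned slices can index the CT volume directly, for example `ct[lung_bounding_box(mask, margin=5)]`, so a small margin keeps lesions near the lung border inside the crop.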

JoHof / lungmask

Dataset availability #35