MIC-DKFZ / nnDetection

nnDetection is a self-configuring framework for 3D (volumetric) medical object detection which can be applied to new data sets without manual intervention. It includes guides for 12 data sets that were used to develop and evaluate the performance of the proposed method.
Apache License 2.0

Relevance of masks in training a nnDetection model for nodule detection #211

Closed DSRajesh closed 9 months ago

DSRajesh commented 11 months ago

:question: Question

Hello,

We have several datasets for training an nnDetection model for nodule detection, and we have a few questions about the masks used in the training process:

  1. Some of our datasets include nodule segmentation masks, while others provide only the nodule center and bounding box. For the latter, can we approximate each nodule with an ellipsoid mask that just fits inside the bounding box?

  2. Since a trained nnDetection model outputs bounding boxes around the detected nodules, will this heterogeneity of masks across the datasets affect model training?

How important is the nature of the masks used during training?

Thank you,

Rajesh

mibaumgartner commented 11 months ago

Hello @DSRajesh,

1) Yes, it is definitely possible to approximate the nodules with ellipsoids to train nnDetection, and the results will still be suitable; this is also what we did for LUNA16. In some instances the segmentations can overlap (since boxes can overlap too), which needs a manual strategy to resolve. There might be some conflicting gradients when training with a mixture of exact and rough segmentations. This problem hasn't come up for me yet, so it is hard to comment on the exact performance degradation. I would experiment with different settings, e.g. train one model only on the subset with exact segmentations and another on the mixture of annotations, and compare their performance in the 5-fold CV setting.

2) As described above, there is certainly some influence on the segmentation part of the model, and the multi-task learning might influence the detection performance as well. There are also differences in the augmentations: e.g. rotating an approximate structure and taking its axis-aligned box introduces some errors in the training boxes. With exact segmentations, Retina U-Net consistently outperformed its box-only counterpart (RetinaNet). Furthermore, multiple papers have used a segmentation approximation and showed improved results. I haven't seen literature where exact segmentations and approximations were mixed.
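For the ellipsoid approximation in point 1, a minimal NumPy sketch could look like the following. It is only an illustration and not the nnDetection or LUNA16 preprocessing code: the function name `ellipsoid_mask_from_box`, the `label` parameter, and the box convention (inclusive lower corner, exclusive upper corner, in voxel indices) are all assumptions made for this example.

```python
import numpy as np


def ellipsoid_mask_from_box(shape, box_min, box_max, label=1):
    """Rasterize the ellipsoid inscribed in an axis-aligned box.

    Hypothetical helper (not part of nnDetection).
    shape:   volume size in voxels, e.g. (z, y, x)
    box_min: inclusive lower corner of the box, in voxel indices
    box_max: exclusive upper corner of the box, in voxel indices
    label:   integer written into the mask for this object
    """
    box_min = np.asarray(box_min, dtype=float)
    box_max = np.asarray(box_max, dtype=float)

    # Voxel i covers [i, i + 1), so the box spans [box_min, box_max)
    # and voxel centers sit at i + 0.5.
    center = (box_min + box_max) / 2.0
    radii = (box_max - box_min) / 2.0

    grids = np.meshgrid(*[np.arange(s) + 0.5 for s in shape], indexing="ij")

    # Normalized squared distance: <= 1 inside the ellipsoid.
    dist = sum(((g - c) / r) ** 2 for g, c, r in zip(grids, center, radii))

    mask = np.zeros(shape, dtype=np.uint8)
    mask[dist <= 1.0] = label
    return mask


# Example: a 6 x 8 x 6 voxel box inside a 16^3 volume.
mask = ellipsoid_mask_from_box((16, 16, 16), (4, 5, 6), (10, 13, 12))
voxels = np.argwhere(mask > 0)
recovered_min = voxels.min(axis=0)
recovered_max = voxels.max(axis=0) + 1  # back to an exclusive upper corner
```

In this example the box recovered from the mask matches the input box, while the ellipsoid itself fills only about half of the box volume (π/6 in the continuous case). When several boxes overlap, the per-object masks would still need a rule for deciding which object owns the shared voxels, as noted in point 1.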

Best, Michael

github-actions[bot] commented 9 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 9 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.