sct-pipeline / contrast-agnostic-softseg-spinalcord

Contrast-agnostic spinal cord segmentation project with softseg
MIT License

What threshold to use on model prediction? #98

Closed jcohenadad closed 4 months ago

jcohenadad commented 6 months ago

The model generates predictions that are too soft. Some voxels are non-zero, even >5 voxels away from the cord. This clearly does not reflect partial volume, but is likely the result of excessive softness of our ground truth.

Some reports: https://docs.google.com/presentation/d/1cmYdhSQieDN7c2QTx6suroGeL7P-3zmkF_8N2KVX1q0/edit#slide=id.g2b082a87bce_9_7

naga-karthik commented 4 months ago

Since the issue is a bit old now, here's a summary of the important updates:

I ran this script to understand the effect of different thresholds on the CSA using the soft_bin model. Shown below is the STD of CSA across contrasts for each threshold. Each scatter point in the violin plot is one (test) subject’s average CSA across contrasts.

STD CSA across thresholds ![std_csa_threshold](https://github.com/sct-pipeline/contrast-agnostic-softseg-spinalcord/assets/53445351/81ef5570-e294-4892-bb61-346972ab064b)

Based on the plot, it seems that the STD is similar across thresholds when averaged across contrasts. Therefore, we can go with threshold=0.1 as it has the smallest mean STD (in text on top of violin) compared to other thresholds.
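For readers who want to reproduce this kind of comparison, here is a minimal sketch of the selection logic. It is not the script referenced above: the function names, the `(n_subjects, n_contrasts)` array layout, and the use of the population STD (`ddof=0`) are all assumptions made for illustration.

```python
import numpy as np


def mean_std_across_contrasts(csa):
    """Average, over subjects, of each subject's STD of CSA across contrasts.

    csa: array of shape (n_subjects, n_contrasts) (assumed layout).
    """
    csa = np.asarray(csa, dtype=float)
    # One STD per subject, computed across that subject's contrasts.
    per_subject_std = np.std(csa, axis=1)
    return float(np.mean(per_subject_std))


def pick_threshold(csa_by_threshold):
    """Return the threshold with the smallest mean STD, plus all scores.

    csa_by_threshold: dict mapping threshold -> (n_subjects, n_contrasts) array.
    """
    scores = {t: mean_std_across_contrasts(c) for t, c in csa_by_threshold.items()}
    best = min(scores, key=scores.get)
    return best, scores
```

Calling `pick_threshold` with one CSA array per candidate threshold returns the threshold whose mean STD is lowest, which is the criterion used above to pick 0.1.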

Important clarification: The threshold chosen here will only be used during inference (i.e. when the user requires a soft prediction as the output). For future versions of the contrast-agnostic model, the input segmentations are soft segmentations binarized at 0.5, so there is no need to worry about thresholds for training.

naga-karthik commented 4 months ago

Closing, as the following has been decided:

  1. If predictions are to be used for training the next version of the model --> binarize outputs with threshold=0.5 (because we want to feed binarized segmentations to the model)
  2. If predictions are to be used just as (soft) outputs --> use threshold=0.1 (so that values < 0.1 are set to 0 and the rest are kept the same) but do not binarize (i.e. do not create a 0/1 array)
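The two cases can be sketched with NumPy as follows. This is an illustration only, not the project's inference code: the function names are hypothetical, and treating a value exactly equal to 0.5 as foreground when binarizing is an assumption (the thread only specifies that values < 0.1 become 0 in the soft case).

```python
import numpy as np


def binarize(pred, threshold=0.5):
    """Case 1: return a 0/1 mask for use as training input.

    Assumption: values equal to the threshold count as foreground.
    """
    pred = np.asarray(pred, dtype=float)
    return (pred >= threshold).astype(np.uint8)


def suppress_below(pred, threshold=0.1):
    """Case 2: zero out values below the threshold, keep the rest soft."""
    out = np.array(pred, dtype=float)  # copy, so the input is not modified
    out[out < threshold] = 0.0
    return out
```

With a prediction such as `[0.05, 0.4, 0.9]`, `binarize` gives a mask `[0, 0, 1]`, while `suppress_below` removes only the 0.05 value and leaves 0.4 and 0.9 untouched.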