sct-pipeline / contrast-agnostic-softseg-spinalcord

Contrast-agnostic spinal cord segmentation project with softseg
MIT License

Naming convention for adding new contrasts to the training set #102

Open naga-karthik opened 4 months ago

naga-karthik commented 4 months ago

I am working on adding new datasets/contrasts to augment the existing spine-generic dataset. I wanted to clarify/confirm a few things about the naming conventions to use for the outputs derived from the contrast-agnostic model (which will be used for training).

Example folder structure for `basel-mp2rage`

```
basel-mp2rage
├── README.md
├── dataset_description.json
├── participants.tsv
├── participants.json
├── code/
├── derivatives
│   ├── labels
│   │   ├── dataset_description.json
│   │   ├── README.md
│   │   └── sub-CXXX
│   │       └── anat
│   │           ├── sub-CXXX_UNIT1_label-SC_seg.nii.gz
│   │           └── sub-CXXX_UNIT1_label-SC_seg.json
│   └── labels_softseg_bin    ---> this folder is newly added
│       ├── dataset_description.json
│       ├── README.md
│       └── sub-CXXX
│           └── anat
│               ├── sub-CXXX_UNIT1_desc-softseg_label-SC_seg.nii.gz
│               └── sub-CXXX_UNIT1_desc-softseg_label-SC_seg.json
└── sub-CXXX
    └── anat
        ├── sub-CXXX_UNIT1.nii.gz
        └── sub-CXXX_UNIT1.json
```
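
For reference, the naming pattern in the tree above can be sketched as a small Python helper. The function name `softseg_bin_paths` and its arguments are hypothetical and not part of the repository; the helper only assembles paths that follow the `sub-CXXX_UNIT1_desc-softseg_label-SC_seg` pattern shown above.

```python
from pathlib import Path


def softseg_bin_paths(dataset_root, subject, contrast):
    """Return (nifti, json) paths for one subject's prediction.

    Hypothetical helper: it only mirrors the folder layout sketched above.
    """
    anat = Path(dataset_root) / "derivatives" / "labels_softseg_bin" / subject / "anat"
    stem = f"{subject}_{contrast}_desc-softseg_label-SC_seg"
    return anat / f"{stem}.nii.gz", anat / f"{stem}.json"


nii, sidecar = softseg_bin_paths("basel-mp2rage", "sub-CXXX", "UNIT1")
print(nii.as_posix())
print(sidecar.as_posix())
```

Keeping the stem in one place means the image and its JSON sidecar cannot drift apart if the convention changes again.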

Contents of the json file

{
    "Name": "contrast-agnostic model",
    "Version": "SCT v6.2" / "v2.0", # Should I use SCT version or the tag v2.0 of the contrast-agnostic repo?
    "Date": "2024-03-21"
}

The issue is that the model currently in SCT v6.2 is the original soft model trained on soft GTs, whereas the model I will use for inference and for training with new contrasts is the soft_bin model.

EDIT: updated the filename for labels_softseg_bin

sandrinebedard commented 4 months ago

If it is under labels_softseg_bin, then softseg is not the right name. You can check the naming convention we decided on in my branch of spine-generic data-multi-subject.

I would go with C-A version since the 2.0 is not in sct, is that right?

naga-karthik commented 4 months ago

Right, I checked and updated folder structure in my comment! The filenames are like this now:

sub-CXXX_UNIT1_desc-softseg_label-SC_seg.nii.gz

> I would go with C-A version since the 2.0 is not in sct, is that right?

Correct, it's not in SCT. The 2.0 essentially comes from our latest release (though, technically, even this is not the soft_bin model; I think that's okay, since the difference from the original soft model is small).

sandrinebedard commented 4 months ago

Maybe we should create a release (like v.1?)

Nilser3 commented 4 months ago

For nih-ms-mp2rage I have generated these JSON files

{
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.0"
    },
    {
      "Author": "Nilser Laines Medina",
      "Date": "2023-12-14",
      "Note": "Binarised spinal cord soft segmentation"
    }
  ]
}

naga-karthik commented 4 months ago

Thanks @Nilser3 for your JSON example! I think I will go with contrast-agnostic-softseg-spinalcord as the Name but with a different version.

@sandrinebedard Sure, we can create a new release v2.1 today, and I will use it for the JSON sidecars.

NathanMolinier commented 4 months ago

> For nih-ms-mp2rage I have generated these JSON files
>
> {
>   "GeneratedBy": [
>     {
>       "Name": "contrast-agnostic-softseg-spinalcord",
>       "Version": "2.0"
>     },
>     {
>       "Author": "Nilser Laines Medina",
>       "Date": "2023-12-14",
>       "Note": "Binarised spinal cord soft segmentation"
>     }
>   ]
> }

The "Name" field is missing from the second entry in this example. Where was the second step generated from? @Nilser3
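
A check like this can be automated. The following is a minimal sketch, assuming the sidecar has already been read into a string; the function name `entries_missing_name` is hypothetical and not part of any existing tool.

```python
import json

# The sidecar from the example above, with no "Name" in the second entry.
sidecar_text = """
{
  "GeneratedBy": [
    {"Name": "contrast-agnostic-softseg-spinalcord", "Version": "2.0"},
    {"Author": "Nilser Laines Medina", "Date": "2023-12-14",
     "Note": "Binarised spinal cord soft segmentation"}
  ]
}
"""


def entries_missing_name(sidecar):
    """Return the indices of GeneratedBy entries without a "Name" key."""
    return [i for i, entry in enumerate(sidecar.get("GeneratedBy", []))
            if "Name" not in entry]


missing = entries_missing_name(json.loads(sidecar_text))
print(missing)  # [1]: the second entry has no "Name"
```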

Nilser3 commented 4 months ago

Thanks for the feedback, @NathanMolinier. I see that it still did not follow the new convention. I think something like this would be better:

{
  "SpatialReference": "orig",
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.0"
    },
    {
      "Name": "Manual",
      "Author": "Nilser Laines Medina",
      "Date": "2023-12-14",
      "Note": "Binarised spinal cord soft segmentation"
    }
  ]
}

valosekj commented 4 months ago

Just a nitpick: we concluded here that we should use the yyyy-mm-dd hh:mm:ss format for Date, to make it easy to distinguish the order of corrections.
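
To illustrate why this format helps with ordering, here is a small Python sketch. The second timestamp is made up for the example. Because every field is zero-padded and ordered from most to least significant, sorting the strings gives the same result as sorting the parsed times.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # yyyy-mm-dd hh:mm:ss

# Two corrections made on the same day (second value is hypothetical).
stamps = ["2024-03-08 17:46:09", "2024-03-08 09:12:30"]

# Plain string sort versus sort on the parsed datetime.
by_string = sorted(stamps)
by_time = sorted(stamps, key=lambda s: datetime.strptime(s, DATE_FORMAT))
print(by_string == by_time)  # True

# Generating a stamp in this format for a new sidecar entry.
print(datetime.now().strftime(DATE_FORMAT))
```

With a date-only value such as 2023-12-14, two corrections made on the same day would be indistinguishable.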

NathanMolinier commented 4 months ago

> {
>   "SpatialReference": "orig",
>   "GeneratedBy": [
>     {
>       "Name": "contrast-agnostic-softseg-spinalcord",
>       "Version": "2.0"
>     },
>     {
>       "Name": "Manual",
>       "Author": "Nilser Laines Medina",
>       "Date": "2023-12-14",
>       "Note": "Binarised spinal cord soft segmentation"
>     }
>   ]
> }

Just out of curiosity, did you really binarize manually? Or did you use a custom script that does the thresholding?

Nilser3 commented 4 months ago

It was binarized with sct_maths after generating the soft masks.

However, I think I will remove this "Note", because I will regenerate these SC masks with the latest version of the contrast-agnostic model (whose output is already binary).

NathanMolinier commented 4 months ago

> was binarized sct_maths after generating the soft masks,

Then you should specify sct_maths as the method in the "Name" field instead of Manual, and potentially provide the command you ran, as below:

{
  "SpatialReference": "orig",
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.0"
    },
    {
      "Name": "sct_maths",
      "Param": "-thr 0.8",
      "Version": "SCT v6.2",
      "Note": "Binarised spinal cord soft segmentation"
    }
  ]
}

naga-karthik commented 4 months ago

It's pretty cool that you can define the JSON dict directly in the sct_run_batch script and create the JSON file for each subject from the contents of that dict.

code snippet

```bash
# timestamp in the "yyyy-mm-dd hh:mm:ss" format
date_time=$(date +"%Y-%m-%d %H:%M:%S")
json_dict='{
  "GeneratedBy": [
    {
      "Name": "contrast-agnostic-softseg-spinalcord",
      "Version": "2.1",
      "Date": "'$date_time'"
    }
  ]
}'

PATH_DATA_PROCESSED_CLEAN="${PATH_DATA_PROCESSED}_clean"

# create new folder and copy only the predictions
mkdir -p ${PATH_DATA_PROCESSED_CLEAN}/derivatives/labels_softseg_bin/${SUBJECT}/anat
rsync -avzh ${file}_seg_monai.nii.gz ${PATH_DATA_PROCESSED_CLEAN}/derivatives/labels_softseg_bin/${SUBJECT}/anat/${file}_desc-softseg_label-SC_seg.nii.gz
rsync -avzh ${file}_seg-manual.json ${PATH_DATA_PROCESSED_CLEAN}/derivatives/labels_softseg_bin/${SUBJECT}/anat/${file}_desc-softseg_label-SC_seg.json

# create json file (this overwrites the sidecar copied just above)
echo "$json_dict" > ${PATH_DATA_PROCESSED_CLEAN}/derivatives/labels_softseg_bin/${SUBJECT}/anat/${file}_desc-softseg_label-SC_seg.json

# re-save json files with indentation
python -c "import json
json_file = '${PATH_DATA_PROCESSED_CLEAN}/derivatives/labels_softseg_bin/${SUBJECT}/anat/${file}_desc-softseg_label-SC_seg.json'
with open(json_file, 'r') as f:
    data = json.load(f)
with open(json_file, 'w') as f:
    json.dump(data, f, indent=4)
"
```

contents of json file

```
{
    "GeneratedBy": [
        {
            "Name": "contrast-agnostic-softseg-spinalcord",
            "Version": "2.1",
            "Date": "2024-03-08 17:46:09"
        }
    ]
}
```
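
The same sidecar could also be written entirely from Python, which avoids the shell quoting and the separate re-indentation step. This is only a sketch under assumptions: `write_sidecar` is a hypothetical helper, and the subject and file names in the demo are placeholders rather than values from the actual pipeline.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


def write_sidecar(json_path, version="2.1"):
    """Write an indented GeneratedBy sidecar and return its contents."""
    content = {
        "GeneratedBy": [
            {
                "Name": "contrast-agnostic-softseg-spinalcord",
                "Version": version,
                "Date": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            }
        ]
    }
    json_path = Path(json_path)
    # Create the derivatives folder if it does not exist yet.
    json_path.parent.mkdir(parents=True, exist_ok=True)
    with open(json_path, "w") as f:
        json.dump(content, f, indent=4)
    return content


# Demo in a temporary directory so nothing is written to a real dataset.
with tempfile.TemporaryDirectory() as tmp:
    out = (Path(tmp) / "derivatives" / "labels_softseg_bin" / "sub-CXXX" / "anat"
           / "sub-CXXX_UNIT1_desc-softseg_label-SC_seg.json")
    written = write_sidecar(out)
    reloaded = json.loads(out.read_text())
    print(reloaded == written)
```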