Add evaluations of data quality by radiologists

jcohenadad commented 2 years ago

This will be taken care of by Marcella Lagana and @renelabounek

renelabounek commented 2 years ago

Step 1: Define and unify the qualitative assessment that we request. An experienced radiologist scored the image between 0 (worst) and 5 (best), based on:

The diagnostic relevance of the image
The ability to distinguish GM from WM (contrast)
The sharpness of the GM/WM border (affected by spatial resolution and motion)
Signal drop-out due to intravoxel dephasing
Ghosting artifacts (patterns of overlapped tissue on the cord along the phase-encoding direction).

I am sceptical about the point "The diagnostic relevance of the image". All images cover the same tissue in the same FOV, so all images should have the same diagnostic relevance. I do not see what difference should be expected here across submissions.

Step 2: Write instructions on how to score each requested property. The dynamic range 0-5 gives me a binary excellent (1) or terrible (0) for each of the 5 listed properties. Thus, the scorers should use decimal values for the requested properties across submissions; e.g. the ability to distinguish GM from WM is terrible for submission1 (score: 0.0), moderate for submission2 (score: 0.5) and excellent for submission3 (score: 1.0).
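A minimal sketch of the scheme in Step 2, assuming each of the five properties receives a decimal score in [0, 1] and the per-scan total is their sum, so that the total spans 0-5. The property keys and the `total_score` helper are hypothetical names used only for illustration.

```python
# Hypothetical keys for the five criteria listed in Step 1.
PROPERTIES = (
    "diagnostic_relevance",
    "gm_wm_contrast",
    "gm_wm_sharpness",
    "signal_dropout",
    "ghosting",
)


def total_score(scores: dict[str, float]) -> float:
    """Sum five per-property scores, each in [0, 1], into a 0-5 total."""
    missing = set(PROPERTIES) - set(scores)
    if missing:
        raise ValueError(f"missing properties: {sorted(missing)}")
    for name in PROPERTIES:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} must be within [0, 1], got {scores[name]}")
    return sum(scores[name] for name in PROPERTIES)


# Made-up scores for one scan, for illustration only.
example = {
    "diagnostic_relevance": 1.0,
    "gm_wm_contrast": 0.5,
    "gm_wm_sharpness": 0.5,
    "signal_dropout": 1.0,
    "ghosting": 0.0,
}
print(total_score(example))  # 3.0
```

Validating the range up front catches a scorer who accidentally enters a 0-5 value into a single property column.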

Step 3: Which image(s) will they assess? The first scan of each submission? Or both scans, with the final score per submission being the average over the two scans? I see 13 submissions, so the scorers will assess 13 or 26 scans.
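If both scans are assessed, the per-submission score mentioned in Step 3 could be computed as below. This is only a sketch: the submission IDs, scan numbers and score values are made up, and `submission_scores` is a hypothetical helper.

```python
from statistics import mean


def submission_scores(scan_scores: dict[tuple[str, int], float]) -> dict[str, float]:
    """Average per-scan scores into one score per submission.

    Keys of `scan_scores` are (submission_id, scan_number) pairs.
    """
    grouped: dict[str, list[float]] = {}
    for (submission, _scan), score in scan_scores.items():
        grouped.setdefault(submission, []).append(score)
    return {submission: mean(values) for submission, values in grouped.items()}


# Made-up example: two submissions with two scans each.
scans = {
    ("sub-A", 1): 3.0,
    ("sub-A", 2): 4.0,
    ("sub-B", 1): 2.5,
    ("sub-B", 2): 2.5,
}
print(submission_scores(scans))  # {'sub-A': 3.5, 'sub-B': 2.5}
```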

Step 4: Fill in the assessment table. I have designed a Google Sheet which scorers can copy, fill in and send back to us via email. I assumed they will assess both scans per submission. Is the table clear?

Step 5: List of scorers and their contacts. I assume we will exchange the list between us over email.

Does anything else need to be prepared to make the work as easy as possible for the scorers?

jcohenadad commented 2 years ago

Step 4: Fill in the assessment table. I have designed a Google Sheet which scorers can copy, fill in and send back to us via email. I assumed they will assess both scans per submission. Is the table clear?

Great start 🚀 However the dataset has now changed to be compatible with BIDS (https://github.com/sct-pipeline/gm-challenge-data/pull/2). Also note that one site (Zurich) will submit another dataset next week because their first submission was incomplete.

Also note that some entries in the gsheet, as it currently stands, should not be evaluated. I have opened an issue to discard a dataset from the evaluation.

renelabounek commented 2 years ago

I have updated the template sheet. Session sub-10328 has been removed, as requested in the issue. Which submission is Zurich? The removed sub-10328? Or will another submission ID be substituted?

jcohenadad commented 2 years ago

I have updated the template sheet. Session sub-10328 has been removed, as requested in the issue. Which submission is Zurich? The removed sub-10328? Or will another submission ID be substituted?

They are supposed to submit another scan in a few days.

MarcellaLagana commented 2 years ago

René, the MD and I agreed on the following:

1) Remove the question "The diagnostic relevance of the image".

2) Change the score range from 0-5 to 1-5 (1: worst; 3: moderate; 5: best), because the 0-5 scale does not have a middle value.

3) In the shared _Sheet_, we put a sagittal scan as an example for explaining "Signal drop-out due to intravoxel dephasing". We will try to find an axial scan. If you have it, can you share it, please?

4) We will reorder the scans in a randomized order, to blind the MD to repeated scans. Otherwise, they would probably assign the same score if they saw two consecutive scans. We will modify the Google Sheet to match the randomized data that we'll share between us. We will keep the correspondence between the old and new IDs, but the MD will see the data with the new IDs only.

5) Can you please let us know when the missing scan will be uploaded? My MD will assign the scores on Monday and she will be on vacation all August.

renelabounek commented 2 years ago

The randomized scans are publicly available via Google Drive. Two scans are missing because they have not been uploaded to GitHub yet. The randomization table was shared privately. The sheet for quality assessment has been updated to fit the randomized scans. The same sheet in the GitHub order is available here, in case we need to reorganize the results from the randomized order into the GitHub order.
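Mapping scores from the randomized order back to the GitHub order only needs the private randomization table. A minimal sketch, assuming that table is a mapping from blinded ID to original ID; all IDs and scores below are made up, and `restore_order` is a hypothetical helper.

```python
def restore_order(
    blinded_scores: dict[str, float],
    key: dict[str, str],
    original_order: list[str],
) -> list[tuple[str, float]]:
    """Return (original_id, score) pairs in the original dataset order.

    `key` maps blinded ID -> original ID. Scans without a score yet
    (e.g. not uploaded) are skipped.
    """
    by_original = {key[blinded]: score for blinded, score in blinded_scores.items()}
    return [
        (original, by_original[original])
        for original in original_order
        if original in by_original
    ]


# Made-up example: three scans, one of them not scored yet.
key = {"scan-1": "sub-B", "scan-2": "sub-A", "scan-3": "sub-C"}
blinded_scores = {"scan-1": 4.0, "scan-2": 2.0}
print(restore_order(blinded_scores, key, ["sub-A", "sub-B", "sub-C"]))
# [('sub-A', 2.0), ('sub-B', 4.0)]
```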

MarcellaLagana commented 2 years ago

I found the following image for showing the "Signal drop-out due to intravoxel dephasing" in an axial spinal cord slice: It is Figure 4.1.8 of the book "Quantitative MRI of the Spinal Cord" (Editors Julien and Claudia), so we can cite it.

renelabounek commented 2 years ago

Randomized data update:

two missing scans added

Sheet update:

image of example of axial signal drop-out due to intravoxel dephasing
URL link for data download added

I have forwarded materials to scorers.

renelabounek commented 2 years ago

One scoring MD raised an excellent point that we missed:

"It would be beneficial to add one more parameter- the contrast between the spinal cord and CSF - in some images, T2 W decreases and the contours of the spinal cord disappear."

I believe we should ask the scorers for this missing variable too, because it will directly influence the quality of the SC segmentation.

MarcellaLagana commented 2 years ago

My MD also saw the same problem in a few cases, but she is on vacation now and she scored all the cases before leaving.

On Sun, 1 Aug 2021, 16:39 Rene Labounek @.***> wrote:

One scoring MD has had an excellent point which we missed:

"It would be beneficial to add one more parameter- the contrast between the spinal cord and CSF - in some images, T2 W decreases and the contours of the spinal cord disappear."

I believe, we should ask scorers for this missing variable too, because it will directly influence quality of SC segmentation.


renelabounek commented 2 years ago

Graphs of the qualitative assessment have been generated and added to the manuscript. Source code for reproducing the graphs is here: #63. Due to time constraints, the CSF-SC contrast has not been assessed. Two independent radiologists assessed all MRI scans. Their scores are attached in the file qualitative_assessment.xlsx in the pull request. The graphs in the manuscript demonstrate a high level of agreement between the two radiologists.
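The thread does not state how agreement between the two radiologists was quantified, and the code in #63 may do it differently. As one common option for ordinal 1-5 scores, here is a sketch of Cohen's linearly weighted kappa in plain Python; the ratings in the example are made up.

```python
from collections import Counter


def linear_weighted_kappa(rater_a, rater_b, categories=(1, 2, 3, 4, 5)):
    """Cohen's kappa with linear disagreement weights for ordinal scores.

    Returns 1.0 for perfect agreement, about 0 for chance-level agreement,
    and NaN when chance disagreement is zero (both raters used one and the
    same category throughout).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of scans")
    n = len(rater_a)
    index = {category: i for i, category in enumerate(categories)}
    pairs = Counter(zip(rater_a, rater_b))
    margin_a, margin_b = Counter(rater_a), Counter(rater_b)

    # Mean observed distance between the two raters' scores.
    observed = sum(abs(index[a] - index[b]) * count for (a, b), count in pairs.items()) / n
    # Mean distance expected if the raters scored independently.
    expected = sum(
        abs(index[a] - index[b]) * margin_a[a] * margin_b[b]
        for a in categories
        for b in categories
    ) / (n * n)
    if expected == 0:
        return float("nan")
    return 1.0 - observed / expected


# Made-up ratings for six scans.
radiologist_1 = [5, 4, 3, 2, 4, 1]
radiologist_2 = [5, 4, 2, 2, 5, 1]
print(round(linear_weighted_kappa(radiologist_1, radiologist_2), 3))
```

Linear weights penalize a 1-vs-5 disagreement more than a 4-vs-5 one, which suits a graded quality scale better than unweighted kappa.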

jcohenadad commented 2 years ago

Fixed via https://github.com/sct-pipeline/gm-challenge/pull/63

sct-pipeline / gm-challenge

Add evaluations of data quality by radiologists #53