Update spinal levels files generated by the University of Calgary's Phillips Lab based on the Mendez et al. paper

sandrinebedard commented 1 year ago

Description

Partially fixes https://github.com/spinalcordtoolbox/spinalcordtoolbox/issues/3952.

This PR takes the unprocessed Phillips Lab spinal level files (stored in the spinal_levels_PhillipsLab/ folder), modifies and renames them, adds them to the spinal_levels/ folder, then removes the original spinal_levels_PhillipsLab/ folder.

FOR NOW, NOT CONTINUED, see https://github.com/spinalcordtoolbox/PAM50/pull/3#issuecomment-1644622915

Processing

I noticed some differences between the data from the Phillips Lab and the current PAM50/spinal_levels:

~~Orientation LPI vs RPI~~ --> (see comment)
Resolution 1x1x1 vs 0.5x0.5x0.5
qform/sform
Data type float64 vs float32.

However, the matrix size is the same (141, 141, 991), so resampling to 0.5 mm would double the matrix size. Copying the header from the current PAM50/spinal_levels fixed the problem without resampling. I want to confirm this is OK.
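As a rough check of the matrix-size argument, here is a minimal Python sketch. The helper name `resampled_shape` is made up for illustration, and it assumes the resampler keeps the physical field of view fixed; real tools may differ by a voxel because of rounding.

```python
def resampled_shape(shape, old_res_mm, new_res_mm):
    """Matrix size after resampling, assuming the physical field of view is kept."""
    return tuple(round(n * old_res_mm / new_res_mm) for n in shape)


# Matrix size reported in this PR for the Phillips Lab files
phillips_shape = (141, 141, 991)

# Going from 1 mm to 0.5 mm isotropic doubles every dimension,
# so the result no longer matches the 141 x 141 x 991 PAM50 grid.
print(resampled_shape(phillips_shape, 1.0, 0.5))  # (282, 282, 1982)
```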

Processing steps include:

~~Reorient from LPI to RPI~~ (see comment)
Change data type from float64 to float32
Copy header from current PAM50/spinal_levels
Rename files
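To get a feel for what the float64 to float32 step costs, the sketch below round-trips a few values through 32-bit storage using only the standard library. It assumes the level values are probabilities in [0, 1], which is an assumption and not something verified on the actual files; `to_float32` is a hypothetical helper, not part of the processing script.

```python
import struct


def to_float32(x):
    """Round-trip a Python float (float64) through 32-bit storage."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Illustrative values only, not read from the real files
for value in (0.1, 0.25, 0.333333333333, 1.0):
    error = abs(to_float32(value) - value)
    print(f"{value!r}: absolute error after cast = {error:.2e}")
```

For values in [0, 1] the absolute error stays below about 3e-8, which is far smaller than any difference visible in the overlays.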

Exact processing commands are included in the README.md of this PR.

Old version of processing steps

I included the processing scripts in the attached zip file: [process_spinal_levels.zip](https://github.com/spinalcordtoolbox/PAM50/files/10726837/process_spinal_levels.zip) (put the Python script in the same directory as the bash script). Here is what I ran for processing:

```
chmod +x ./process_spinal_levels.sh
./process_spinal_levels.sh "./pam50/Spinal Cord Levels NIfTI"
```

Bibliography

https://www.frontiersin.org/articles/10.3389/fneur.2016.00238/full

jcohenadad commented 1 year ago

Comparing b686796816549d4191dcb1a11a426b5145cfcfbd (_new) and https://github.com/spinalcordtoolbox/PAM50/commit/f8cfc4b331abfda1caaf0d9e72be94df2a9858ac (master):

C1

![anim](https://user-images.githubusercontent.com/2482071/218615447-0144016a-d5cf-43cd-845b-5cd26f6c4f78.gif)

C2

![anim](https://user-images.githubusercontent.com/2482071/218615509-064d5cf3-a798-49d9-8a8d-6030d233e19f.gif)

C3

![anim](https://user-images.githubusercontent.com/2482071/218615553-6d2fefa7-8085-49cd-9abd-43e96c9bebc2.gif)

C4

![anim](https://user-images.githubusercontent.com/2482071/218615582-0be3cd08-f0dd-49e7-8e96-8d4157165f71.gif)

Few comments:

Is that normal that C1 is completely blank?
Is that normal that in the new version, values seem almost binary, i.e. the same value is spread out along the S-I axis, without "smoothness" to it? E.g. zoomed version of C2_new (same value across ~20 voxels). Maybe we should ask the Phillips group about that?

![image](https://user-images.githubusercontent.com/2482071/218616040-6eac577f-4c13-4cfe-8f1c-9800ff30c252.png) With the histogram of C2_new: ![screenshot](https://user-images.githubusercontent.com/2482071/218616316-a82aa957-d19d-43ee-8e7e-3ead2929926d.png)
The cord seg and the spinal levels seem to be perfectly matching (good), although I have not verified all levels. @sandrinebedard have you verified that? See example of good matching below:

![anim](https://user-images.githubusercontent.com/2482071/218616851-15cb50e6-3cf7-41a5-b897-d769f7316f39.gif)
The world coordinates of the source (Phillips Lab) and PAM50, for the same voxel location, are drastically different. I guess this is the reason for the qform copy (see example below).

![image](https://user-images.githubusercontent.com/2482071/218617926-930ccce0-1ea4-43cf-9467-e1b152bcc0ac.png) ![image](https://user-images.githubusercontent.com/2482071/218618035-9fb90988-ba71-4c06-80ba-f5f07f12d5e6.png)
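One way to quantify the "almost binary" impression beyond a screenshot is to count distinct values and measure how long a single value persists along the S-I axis. The sketch below is a hedged illustration: both helper names are hypothetical, and the profile is a synthetic list standing in for one column of voxels, which would in practice be extracted from the NIfTI file with an image library.

```python
def distinct_nonzero_values(values, decimals=3):
    """Sorted distinct non-zero values after rounding."""
    return sorted({round(v, decimals) for v in values if v != 0})


def longest_constant_run(profile):
    """Length of the longest run of identical consecutive non-zero values."""
    best = run = 0
    prev = None
    for v in profile:
        if v != 0 and v == prev:
            run += 1          # same value as the previous voxel
        elif v != 0:
            run = 1           # a new non-zero value starts a new run
        else:
            run = 0           # background resets the run
        best = max(best, run)
        prev = v
    return best


# Synthetic S-I profile (NOT real data): a plateau of one value, then another
profile = [0.0] * 5 + [0.2] * 20 + [0.5] * 8 + [0.0] * 5

print(distinct_nonzero_values(profile))
print(longest_constant_run(profile))
```

A smooth probabilistic map would give many distinct values and short runs; a handful of values with runs of ~20 voxels would match what the zoomed C2_new view suggests.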

joshuacwnewton commented 1 year ago

For posterity, should we put these scripts somewhere or is it fine to keep it zipped in this PR? @joshuacwnewton

I think it would be nice to include the scripts in this repo, alongside the files themselves. (In past NeuroPoly datasets, sometimes I've seen additional README.md files that contain a text description + a code block with the processing steps. I like that method a lot, because it auto-displays on the GitHub page for the subfolder, and provides more room to write plain-English descriptions.)

The only issue here is that there are both shell commands (.sh) and some Python file renaming logic (.py), making it more of a chore to just put all of the code in a README.md file. :thinking:

Ah! I have an idea. We could pretty easily replace the Python script with some bash logic, since we already have access to info_label.txt, which stores the mappings between spinal level names and filenames. Then we could just have a single set of bash steps that we put into the README. I'll add it to this PR, should take 5m. :)
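The idea of driving the renaming from info_label.txt can be sketched as follows. This is shown in Python for illustration, not in the bash that is planned for the README, and it assumes the file uses `#` comment lines followed by `ID, name, file` rows; the sample text and the function name are hypothetical, not copied from the repository.

```python
def parse_info_label(text):
    """Map each label name to its filename from info_label.txt-style text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = [p.strip() for p in line.split(",", 2)]
        if len(parts) != 3:
            continue  # ignore rows that do not look like "ID, name, file"
        _label_id, name, filename = parts
        mapping[name] = filename
    return mapping


# Hypothetical sample content (assumed layout)
sample = """\
# ID, name, file
0, Spinal level C1, spinal_level_01.nii.gz
1, Spinal level C2, spinal_level_02.nii.gz
"""

for name, filename in parse_info_label(sample).items():
    print(f"{name} -> {filename}")
```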

Is that normal that C1 is completely blank?

Related SCT issue:

https://github.com/spinalcordtoolbox/PAM50/issues/20

The cord seg and the spinal levels seem to be perfectly matching (good), although I have not verified all levels.

Possibly related past SCT discussion:

joshuacwnewton commented 1 year ago

One other small idea: It might be best to commit the unprocessed spinal level files to this repo first (in a separate PR: https://github.com/spinalcordtoolbox/PAM50/pull/4). Then, we merge this PR, which will update the files.

This allows us to refer to a commit within this repo in the README.md, so that the source files are backed up safely and we are not at risk if the PhillipsLab repo is deleted one day.

jcohenadad commented 1 year ago

Another alternative would be to upload the source files (from Phillips) and the script as release assets when releasing the updated spinal levels. Arguments:

the repo is already pretty big (it took me quite some time to git clone). We wouldn't want to blow it out of proportion, especially with increasingly constraining regulations about repo size limits, etc.
the script is very specific to the Phillips files, and I'm afraid that if we put it in the repo, some people might misinterpret what it is for, etc.

sandrinebedard commented 1 year ago

The cord seg and the spinal levels seem to be perfectly matching (good), although I have not verified all levels. @sandrinebedard have you verified that? See example of good matching below:

I'll verify them all!

Is that normal that C1 is completely blank?

They did not include any C1 files

Is that normal that in the new version, values seem almost binary, i.e. the same value is spread out along the S-I axis, without "smoothness" to it? E.g. zoomed version of C2_new (same value across ~20 voxels). Maybe we should ask the Phillips group about that?

I'll also check https://github.com/PhillipsLab/pam50/tree/main/DREZ%20NIfTI to see if the Spinal Cord Levels folder was the right one to take here.

Is that normal that the resampling is commented out?

The world coordinates of the source (Phillips Lab) and PAM50, for the same voxel location, are drastically different. I guess this is the reason for the qform copy (see example below).

Regarding the differences in world coordinates and resampling, I wasn't sure how to proceed.

The dimensions and resolution of:

New spinal levels files: 141x141x991 and 1x1x1
Old spinal levels files: 141x141x991 and 0.5x0.5x0.5

If we resample to 0.5 mm isotropic, the matrix size doubles, which no longer matches the PAM50 space.

I thought of doing a registration with identity; however, since the world coordinates differ, it doesn't bring the spinal levels into the PAM50 space. How I overcame this issue is by copying the image header (for qform), and this also solves the resolution problem. I am not sure this is the best way to overcome it, however...
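To illustrate why the same voxel index lands at different world coordinates under two headers, and why an identity registration therefore does not align them, here is a small sketch. Both affines are hypothetical placeholders chosen for easy arithmetic; they are not the actual qform matrices of the Phillips or PAM50 files.

```python
def voxel_to_world(affine, ijk):
    """Apply a 4x4 voxel-to-world affine (nested lists) to a voxel index."""
    i, j, k = ijk
    return tuple(row[0] * i + row[1] * j + row[2] * k + row[3] for row in affine[:3])


# Hypothetical source header: 1 mm isotropic, origin at the first voxel
source_affine = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

# Hypothetical template-like header: 0.5 mm isotropic, flipped x, shifted origin
template_affine = [
    [-0.5, 0, 0, 35],
    [0, 0.5, 0, -35],
    [0, 0, 0.5, -250],
    [0, 0, 0, 1],
]

voxel = (70, 70, 500)
print(voxel_to_world(source_affine, voxel))    # (70, 70, 500)
print(voxel_to_world(template_affine, voxel))  # (0.0, 0.0, 0.0)
```

Copying the header amounts to swapping the first matrix for the second while leaving the voxel array untouched, which is only valid if the voxel grids truly correspond one to one.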

joshuacwnewton commented 1 year ago

the repo is already pretty big (it took me quite some time to git clone). We wouldn't want to blow it out of proportion, especially with increasingly constraining regulations about repo size limits, etc.

I think that this is the unavoidable nature of trying to version-track binary files like we're doing here. For example, the total size of the recently-added histology files is 93MB. By comparison, the original Phillips files are ~700kB each, for a total of 16.7MB.

With or without the additional Phillips files, slow repo clones will be an issue. So, if it is unavoidable, I lean towards keeping the files in-repo for posterity.

(Additionally, the GitHub repo size limit seems to be 5GB (strongly recommended), so 16.7MB hardly moves the needle in that regard.)

the script is very specific to the Phillips files, and I'm afraid that if we put it in the repo, some people might misinterpret what it is for, etc.

My thought process was a README.md file containing the following contents:

### PAM50 Levels

These level files are a slightly modified copy of the [level files](https://github.com/PhillipsLab/pam50/tree/main/Spinal%20Cord%20Levels%20NIfTI) produced by the [Phillips Lab](https://github.com/PhillipsLab): 

Modifications include:

- Reorient from LPI to RPI
- Change data type from float64 to float32
- Copy header from current PAM50/spinal_levels 
- Rename files

To reproduce the modified files, please run `git checkout <commit>`, then run the following script in your terminal:

```
<insert commands here>
```

That way it is unambiguous what the commands were used for, and no one can just accidentally run the commands, similar to what was done for sct_testing_data/template/README.md.

jcohenadad commented 1 year ago

They did not include any C1 files

Hum, I am a bit reluctant to include a file labeled "C1 spinal levels" but without any label inside-- this is quite confusing for users. Maybe we should just get rid of it?

I'll also check https://github.com/PhillipsLab/pam50/tree/main/DREZ%20NIfTI to see if the Spinal Cord Levels folder was the right one to take here.

I think we should include the Phillips group right away in the conversation to avoid any confusion. I'll take care of that.

How I overcame this issue is by copying the image header (for qform) and this also solves the resolution problem.

Only changing the image header should not affect the physical dimension of the voxel, unless (i) you also did a resampling afterwards or (ii) the "1mm iso" on the native image was wrong, and the true voxel size was in fact 0.5mm iso.
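The point can be made concrete by computing the physical extent that each header declares for the same 141x141x991 array. This is a sketch under assumptions: the diagonal affines are placeholders, and the helper names are hypothetical.

```python
import math


def voxel_sizes(affine):
    """Voxel size along each axis: norm of the first three affine columns."""
    return tuple(
        math.sqrt(sum(affine[r][c] ** 2 for r in range(3))) for c in range(3)
    )


def extent_mm(shape, affine):
    """Physical extent declared by the header for a given matrix size."""
    return tuple(n * s for n, s in zip(shape, voxel_sizes(affine)))


shape = (141, 141, 991)  # identical before and after the header copy

# Placeholder headers declaring 1 mm and 0.5 mm isotropic voxels
header_1mm = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
header_05mm = [[0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 0.5, 0], [0, 0, 0, 1]]

print(extent_mm(shape, header_1mm))   # (141.0, 141.0, 991.0)
print(extent_mm(shape, header_05mm))  # (70.5, 70.5, 495.5)
```

The array is the same in both cases, so the header copy halves the declared anatomy size. The result can only be anatomically correct under hypothesis (ii), i.e. if the data were really on a 0.5 mm grid all along.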

sandrinebedard commented 1 year ago

I found some small mismatches between the spinal cord (red) of the PAM50 and the spinal levels:

(I have not gone through the entire cord yet)

Slices: 976, 966, 963, 951, 942 (more to come)

It looks like R-L is reversed. This suggests that the reorientation to RPI may not have been necessary and that the image header is wrong, which in turn suggests that the true voxel size may also be 0.5mm iso.

(ii) the "1mm iso" on the native image was wrong, and the true voxel size was in fact 0.5mm iso.

I am testing the processing without reorientation to see if we still have mismatch.
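A simple way to automate this kind of check, instead of scrolling through slices, is to count level voxels that fall outside the cord segmentation, with and without an R-L flip. The sketch below uses tiny nested lists indexed as `volume[x][y][z]` with made-up values; the function names are hypothetical, and real volumes would be loaded from the NIfTI files.

```python
def flip_lr(volume):
    """Reverse the first (x, i.e. R-L) axis of a nested-list volume[x][y][z]."""
    return volume[::-1]


def voxels_outside_mask(levels, cord_mask):
    """Count voxels with a non-zero level value where the cord mask is empty."""
    count = 0
    for plane_l, plane_m in zip(levels, cord_mask):
        for row_l, row_m in zip(plane_l, plane_m):
            for value, inside in zip(row_l, row_m):
                if value > 0 and not inside:
                    count += 1
    return count


# Toy 4x1x1 example (NOT real data): the cord is slightly asymmetric in x
cord_mask = [[[0]], [[1]], [[1]], [[1]]]
levels = [[[0.0]], [[0.5]], [[0.5]], [[0.5]]]

print(voxels_outside_mask(levels, cord_mask))           # 0
print(voxels_outside_mask(flip_lr(levels), cord_mask))  # 1
```

Whichever orientation gives zero (or the fewest) voxels outside the cord is the one consistent with the segmentation. Since the cord is nearly symmetric, only a few slices would show a difference, which fits the small mismatches listed above.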

sandrinebedard commented 1 year ago

No mismatch when I remove the reorientation to RPI: (slice 942)

jcohenadad commented 1 year ago

No mismatch when I remove the reorientation to RPI: (slice 942)

Ah! Good finding 😊

sandrinebedard commented 1 year ago

I updated the files removing the reorientation to RPI: https://github.com/spinalcordtoolbox/PAM50/pull/3/commits/ce64bdf30f7b544ae9eb5bf76dd883ecade01b0f process_spinal_levelsv2.zip

joshuacwnewton commented 1 year ago

I updated the files removing the reorientation to RPI: https://github.com/spinalcordtoolbox/PAM50/commit/ce64bdf30f7b544ae9eb5bf76dd883ecade01b0f (process_spinal_levelsv2.zip)

I have taken the above scripts and put them into a README.md file. But, I slightly modified the scripts to remove the dependency on Python. I also modified the script so that it can be run directly on commit e854bbad9ab550fd93acabeaf43c97cf66b3a4e5.

(The differences between "process_spinal_levelsv2.zip" and the README.md file are shown here.)

sandrinebedard commented 1 year ago

We decided not to pursue the inclusion of the Mendez et al. paper metrics to update the spinal levels in the PAM50 template. Here is a summary of what was done, with the related issues:

The updated version from the Phillips Lab had some limitations related to linear scaling: spinalcordtoolbox/spinalcordtoolbox#10
We tried using the distance from the intervertebral foramen to the rostral/caudal rootlets at the dorsal entry to identify the spinal levels. We manually labeled the intervertebral foramen in the PAM50. However, the scaling was wrong when bringing the measures into the PAM50 template: spinalcordtoolbox/spinalcordtoolbox#12
We tested using a ratio between the PAM50 and Mendez et al. measures of the foramen-to-rootlet distance: we manually labeled the dorsal rostral and caudal rootlet entries and the intervertebral foramen in the PAM50 template, then computed the ratio between the foramen to rostral/caudal rootlet distances in the Mendez et al. paper and in the PAM50. The ratio was not very stable; already at C8 the estimation was bad: https://github.com/spinalcordtoolbox/PAM50/issues/12#issuecomment-1644621236
