Closed jcohenadad closed 1 year ago
From @valosekj
What's the size of all these files? (related to #7)
After resampling all files to 0.083333 x 0.083333 x 0.5 mm, the total size is 179M:
valosek@macbook-pro:~/code/PAM50/histology$ du -h
179M .
i.e., a dramatic increase from current 92M:
valosek@macbook-pro:~/code/PAM50/histology_backup$ du -h
92M .
I tried changing the gzip compression level from the default 6 to 9 ("slowest compression level, which provides the smallest file size"), but the size decrease is negligible:
valosek@macbook-pro:~/code/PAM50/histology$ du -h
177M .
I also tried bzip2; the compression performance is similar.
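The level 6 vs. 9 comparison above can also be run without rewriting any files. The sketch below uses only Python's standard `gzip` module; `recompressed_sizes` is a hypothetical helper written for this illustration, not part of SCT.

```python
import gzip


def recompressed_sizes(path, levels=(6, 9)):
    """Return {level: compressed size in bytes} for the payload of a .gz file.

    The file is decompressed once in memory and then recompressed at each
    requested gzip level; nothing is written to disk.
    """
    with gzip.open(path, "rb") as f:
        raw = f.read()
    return {level: len(gzip.compress(raw, compresslevel=level)) for level in levels}
```

For example, `recompressed_sizes("PAM50_200um_AVF.nii.gz")` returns one size per level, which makes it easy to see how little level 9 gains for a given volume.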
We can try to play with the data type, which is currently float32.
I would not play with this.
FLOAT32 should be the one to use, I think.
Alternatively, we change the resampling to 200 µm x 200 µm x 500 µm. I think it should be "good enough" for what our users will do with the data (ie: compare with MRI)
This sounds good!
I used:
valosek@macbook-pro:~/code/PAM50/histology_200x200x500um$ for file in *.nii.gz; do sct_resample -i "${file}" -mm 0.2x0.2x0.5 -x linear; done
Then, dim is [3, 108, 83, 205, 1, 1, 1, 1] and pixdim is [-1.0, 0.199074, 0.198795, 0.5, 0.0, 0.0, 0.0, 0.0]:
valosek@macbook-pro:~/code/PAM50/histology_200x200x500um$ for file in *.gz;do echo ${file};sct_image -i ${file} -header | grep dim | head -3;echo "";done
PAM50_200um_AVF.nii.gz
dim [3, 108, 83, 205, 1, 1, 1, 1]
pixdim [-1.0, 0.199074, 0.198795, 0.5, 0.0, 0.0, 0.0, 0.0]
PAM50_200um_Eccentricity.nii.gz
dim [3, 108, 83, 205, 1, 1, 1, 1]
pixdim [-1.0, 0.199074, 0.198795, 0.5, 0.0, 0.0, 0.0, 0.0]
PAM50_200um_EquivDiameter.nii.gz
dim [3, 108, 83, 205, 1, 1, 1, 1]
pixdim [-1.0, 0.199074, 0.198795, 0.5, 0.0, 0.0, 0.0, 0.0]
PAM50_200um_EquivDiameter14.nii.gz
dim [3, 108, 83, 205, 1, 1, 1, 1]
pixdim [-1.0, 0.199074, 0.198795, 0.5, 0.0, 0.0, 0.0, 0.0]
PAM50_200um_EquivDiameter48.nii.gz
dim [3, 108, 83, 205, 1, 1, 1, 1]
pixdim [-1.0, 0.199074, 0.198795, 0.5, 0.0, 0.0, 0.0, 0.0]
PAM50_200um_MVF.nii.gz
dim [3, 108, 83, 205, 1, 1, 1, 1]
pixdim [-1.0, 0.199074, 0.198795, 0.5, 0.0, 0.0, 0.0, 0.0]
PAM50_200um_Naxons.nii.gz
dim [3, 108, 83, 205, 1, 1, 1, 1]
pixdim [-1.0, 0.199074, 0.198795, 0.5, 0.0, 0.0, 0.0, 0.0]
The size is now 31M:
valosek@macbook-pro:~/code/PAM50/histology_200x200x500um$ du -h
31M .
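As a rough sanity check, the drop from 179M to 31M is close to what the voxel counts alone predict. The sketch below compares the two grids using the `dim` values reported in this thread and assumes float32 data (4 bytes per voxel). The figures it prints are uncompressed sizes per volume, so they will not match the `du` output directly.

```python
from math import prod

BYTES_PER_VOXEL = 4  # float32, as stated in the thread


def uncompressed_mib(shape):
    """Uncompressed size of one volume in MiB, ignoring the NIfTI header."""
    return prod(shape) * BYTES_PER_VOXEL / 2**20


fine = (258, 198, 205)   # grid at 0.083333 x 0.083333 x 0.5 mm
coarse = (108, 83, 205)  # grid after resampling to about 0.2 x 0.2 x 0.5 mm

print(f"fine grid:   {uncompressed_mib(fine):.1f} MiB per volume")
print(f"coarse grid: {uncompressed_mib(coarse):.1f} MiB per volume")
print(f"voxel count ratio:     {prod(fine) / prod(coarse):.2f}")
print(f"observed du size ratio: {179 / 31:.2f}")
```

The two ratios come out close to each other, which suggests the size reduction is explained mostly by the smaller grid rather than by any change in compressibility.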
Since you are resampling anyway, how about using a 'round' 0.2 resolution (vs. 0.1991...)?
I indeed intended this using sct_resample -i ${file} -mm 0.2x0.2x0.5 -x linear (see the first command in https://github.com/spinalcordtoolbox/PAM50/issues/7#issuecomment-1483096525). But still, the resulting resolution is 0.199074x0.198795x0.5.
Interesting. This is something we should raise as an SCT issue. It may be related to precision/rounding issues in the library used to do the resampling, and we should clarify what causes this discrepancy.
But regarding this project, if we want to move forward quickly, I'd say let's go with 0.199...
Documented in https://github.com/spinalcordtoolbox/spinalcordtoolbox/issues/4077
I think that sct_resample -mm 0.2 does not work as expected in this particular case due to a specific combination of dim=258 and pixdim=0.083333:
julien-macbook:~/code/PAM50/histology $ sct_image -i PAM50_200um_Naxons.nii.gz -header | grep dim
dim [3, 258, 198, 205, 1, 1, 1, 1]
pixdim [-1.0, 0.083333, 0.083333, 0.5, 0.0, 0.0, 0.0, 0.0]
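One plausible explanation, not confirmed in this thread, is that the resampler rounds the output grid to a whole number of voxels and then derives the voxel size from the preserved field of view, so the requested 0.2 mm cannot be met exactly. The sketch below reproduces the reported values under that assumption. `resampled_grid` is a hypothetical helper, not an SCT function, and the input spacing is assumed to be the float32 value nearest to 1/12 mm (printed as 0.083333 in the header).

```python
import struct

# Assumption: pixdim is stored as float32, and 0.083333 is the float32
# nearest to 1/12 mm (slightly larger than 1/12).
SPACING_MM = struct.unpack("f", struct.pack("f", 1 / 12))[0]


def resampled_grid(n_voxels, spacing_mm, target_mm):
    """Return (new voxel count, effective spacing in mm).

    Assumes the field of view is preserved and the new voxel count is
    rounded to the nearest integer.
    """
    fov_mm = n_voxels * spacing_mm
    new_n = round(fov_mm / target_mm)
    return new_n, fov_mm / new_n


for n in (258, 198):
    new_n, effective = resampled_grid(n, SPACING_MM, 0.2)
    print(f"{n} voxels -> {new_n} voxels at {effective:.6f} mm")
```

Under these assumptions, 258 voxels span about 21.5 mm, which is 107.5 voxels at 0.2 mm. Since only whole voxels are possible, the grid becomes 108 voxels and the spacing shrinks to match. The same reasoning applies to the 198-voxel axis.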
Spin-off of this comment:
I think it would be nice to include the histology atlas upon installation, if it is not "too" large. And I think we should be able to reduce it to 50MB (currently 92MB).