Closed alexfoias closed 4 years ago
@PaulBautin Could you please check the qc to make sure that all images are okay ? A zip file is on the top of the PR. Thanks.
@PaulBautin Could you please check the qc to make sure that all images are okay ? A zip file is on the top of the PR. Thanks.
Should i look if the brain is defaced? The SC looks intact, and everything looks good.
Yes, Could you please check if it's properly defaced (brain and/or spine not chopped) ? Thanks.
On second review (tell me what you think of these errors): On these images the defacing seems to be touching the upper part of the spinal cord
On these images the defacing seems to be chopping a part of the brain:
indeed, these need to be fixed. thanks @PaulBautin
@alexfoias i'm just realizing this PR is a mix between fixing defacing and adding new data-- please separate the two things for clarity and better tracking recording. Defacing should go in this other PR https://github.com/spine-generic/data-multi-subject/pull/19. Also please move the defacing-specific comments to the other PR.
@jcohenadad Sorry for the mix in the changes, but I did the checkout from my master branch after adding the fixed defaced image. I'll fix this PR just to contain the fslAchieva & fslPrisma changes and reopen a new PR for the fixed defaced images.
I fixed the PR. Please find the qc here.qc_dataset_20200803.zip
I fixed the PR.
~the branch still contains modifications that are not related to the PR purpose, i.e. added fsl* data:~
The branch shows some files that are modified (i.e. non-zero BIN), despite only name was supposed to have changed.
That was the rename of the derivatives. Do you want me to remove them and only keep the raw data ? Who will take of the derivatives ?
That was the rename of the derivatives
ah! ok that makes sense
https://github.com/spine-generic/data-multi-subject/pull/21#issuecomment-668035623: this PR is about adding fslAchieva & fslPrisma. Nothing more, nothing less.
To validate the QC of defacing, please refer to https://github.com/spine-generic/data-multi-subject/pull/19
@jcohenadad I did a test with the same command as I changed the name of derivatives for sapienza:
Here is the command:
import os
old_path = '/Users/alfoi/code/data_fork_multi_subject/derivatives/labels/sub-amu02/anat/sub-amu02_T1w_RPI_r_labels-manual.nii.gz'
new_path = '/Users/alfoi/code/data_fork_multi_subject/derivatives/labels/sub-amu02/anat/sub-amu02_T1w_RPI_r_labels-manual_new.nii.gz'
os.rename(old_path,new_path)
When pushing on the branch it shows as:
I'll delete the amu commit from PR after I have your okay.
b00b64be99a1a632a93179719f2dee4911ec63c2
-rwx------ 1 julien staff 557857 14 Jul 21:22 sub-amu02_T1w_RPI_r_labels-manual.nii.gz
aa3087640f297829c3f631588f37b7ef9e7a111e
-rwxr-xr-x 1 julien staff 557857 3 Aug 15:36 sub-amu02_T1w_RPI_r_labels-manual_new.nii.gz
as i told you in the past, you should be careful when moving/renaming files to not change their permissions.
some doc about keeping permissions: https://stackoverflow.com/questions/12181938/how-to-preserve-default-umask-if-renaming-a-tempfile
There's something happening that I don't understand, because for me I have the same permissions.
Is it probably the problem because I work on a fork ? The commit https://github.com/spine-generic/data-multi-subject/commit/aa3087640f297829c3f631588f37b7ef9e7a111e is from the forked repo.
On my fork (https://github.com/alexfoias/data-multi-subject) everything seems okay and it doesn't seem to change the permissions:
Here is my output for c6f2fb49a9a797dcec121a0eb6487af0aa811739:
(base) alfoi@MacBook-Pro:data_fork_multi_subject % git checkout c6f2fb49a9a797dcec121a0eb6487af0aa811739
(base) alfoi@MacBook-Pro:anat % git rev-parse HEAD
c6f2fb49a9a797dcec121a0eb6487af0aa811739
(base) alfoi@MacBook-Pro:anat % pwd
/Users/alfoi/code/data_fork_multi_subject/derivatives/labels/sub-amu02/anat
(base) alfoi@MacBook-Pro:anat % ll
total 1096
drwxr-xr-x 3 alfoi staff 96 3 Aug 16:44 .
drwxr-xr-x 3 alfoi staff 96 3 Aug 16:44 ..
-rwxr-xr-x 1 alfoi staff 557857 3 Aug 16:44 sub-amu02_T1w_RPI_r_labels-manual.nii.gz
Here is my output for aa3087640f297829c3f631588f37b7ef9e7a111e:
(base) alfoi@MacBook-Pro:anat % git checkout aa3087640f297829c3f631588f37b7ef9e7a111e
(base) alfoi@MacBook-Pro:anat % git rev-parse HEAD
aa3087640f297829c3f631588f37b7ef9e7a111e
(base) alfoi@MacBook-Pro:anat % pwd
/Users/alfoi/code/data_fork_multi_subject/derivatives/labels/sub-amu02/anat
(base) alfoi@MacBook-Pro:anat % ll
total 1096
drwxr-xr-x 3 alfoi staff 96 3 Aug 16:46 .
drwxr-xr-x 3 alfoi staff 96 3 Aug 16:46 ..
-rwxr-xr-x 1 alfoi staff 557857 3 Aug 16:46 sub-amu02_T1w_RPI_r_labels-manual_new.nii.gz
@kousu do you have any insights?
@jcohenadad Sorry for the mix in the changes, but I did the checkout from my master branch after adding the fixed defaced image. I'll fix this PR just to contain the fslAchieva & fslPrisma changes and reopen a new PR for the fixed defaced images.
Do we need to purge the previous images completely? Were they already defaced, just incorrectly, or were they not defaced at all? If the latter, we should remember they'll still be in the git log even after this PR goes in, git rm
isn't enough to hide them. If we need to purge them then we need to do surgery on master
to get them out (surgery which will need one extra step after #20 goes in) so I'd like to be involved to hand-hold that procedure.
And I'm really really sorry about this but #20 is going to force you to redo this PR from scratch. Can you save your new files separately from the repo? You'll have to do that then refork and redownload the repo and reapply the work.
-rwx------ 1 julien staff 557857 14 Jul 21:22 sub-amu02_T1w_RPI_r_labels-manual.nii.gz
-rwxr-xr-x 1 julien staff 557857 3 Aug 15:36 sub-amu02_T1w_RPI_r_labels-manual_new.nii.gz
as i told you in the past, you should be careful when moving/renaming files to not change their permissions.
some doc about keeping permissions: https://stackoverflow.com/questions/12181938/how-to-preserve-default-umask-if-renaming-a-tempfile
I see them as 0o0755 like Alex does. I downloaded the repo (it's 21G on disk, ay yi yi) and looked at the permissions changed on this branch for that folder. It does not show any permissions changes:
$ git log --summary b25b78d.. -- derivatives/
commit aa3087640f297829c3f631588f37b7ef9e7a111e (HEAD -> add_fsl_dataset, origin/add_fsl_dataset)
Author: Alexandru Foias <alexandrufoias@gmail.com>
Date: Mon Aug 3 11:59:50 2020 -0400
rename test
rename derivatives/labels/sub-amu02/anat/{sub-amu02_T1w_RPI_r_labels-manual.nii.gz => sub-amu02_T1w_RPI_r_labels-manual_new.nii.gz} (100%)
commit c6f2fb49a9a797dcec121a0eb6487af0aa811739
Merge: 00ccd02 b00b64b
Author: Alexandru Foias <alexandrufoias@gmail.com>
Date: Mon Aug 3 09:42:52 2020 -0400
Merge branch 'master' of https://github.com/spine-generic/data-multi-subject into add_fsl_dataset
commit 7a2e9d85c56d8473e98a88ce752712faf9d635f9
Author: Alexandru Foias <alexandrufoias@gmail.com>
Date: Fri Jul 31 15:10:09 2020 -0400
rename sapienza to fslAchieva
rename derivatives/labels/{sub-sapienza01/anat/sub-sapienza01_T1w_RPI_r_labels-manual.nii.gz => sub-fslAchieva01/anat/sub-fslAchieva01_T1w_RPI_r_labels-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza01/anat/sub-sapienza01_T2star_rms_gmseg-manual.nii.gz => sub-fslAchieva01/anat/sub-fslAchieva01_T2star_rms_gmseg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza01/anat/sub-sapienza01_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva01/anat/sub-fslAchieva01_acq-T1w_MTS_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza01/dwi/sub-sapienza01_dwi_moco_dwi_mean_seg-manual.json => sub-fslAchieva01/dwi/sub-fslAchieva01_dwi_moco_dwi_mean_seg-manual.json} (100%)
rename derivatives/labels/{sub-sapienza01/dwi/sub-sapienza01_dwi_moco_dwi_mean_seg-manual.nii.gz => sub-fslAchieva01/dwi/sub-fslAchieva01_dwi_moco_dwi_mean_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza03/anat/sub-sapienza03_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva03/anat/sub-fslAchieva03_acq-T1w_MTS_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza04/anat/sub-sapienza04_T2star_rms_gmseg-manual.nii.gz => sub-fslAchieva04/anat/sub-fslAchieva04_T2star_rms_gmseg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza04/anat/sub-sapienza04_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva04/anat/sub-fslAchieva04_acq-T1w_MTS_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza05/anat/sub-sapienza05_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva05/anat/sub-fslAchieva05_acq-T1w_MTS_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza06/anat/sub-sapienza06_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva06/anat/sub-fslAchieva06_acq-T1w_MTS_seg-manual.nii.gz} (100%)
commit d8b444f7e7f956bfee7739cd50f7f2dbe768f404
Author: PaulBautin <pbautin70@gmail.com>
Date: Fri Jul 31 12:23:37 2020 -0400
- remove cmrra02
- remove sapienza04 gmseg
- correct gmseg for sapienza01
delete mode 100644 derivatives/labels/sub-cmrra02/anat/sub-cmrra02_T2w_RPI_r_seg-manual.json
delete mode 100644 derivatives/labels/sub-cmrra02/anat/sub-cmrra02_T2w_RPI_r_seg-manual.nii.gz
delete mode 100644 derivatives/labels/sub-sapienza04/anat/sub-sapienza04_T2star_rms_gmseg-manual.json
delete mode 100755 derivatives/labels/sub-sapienza04/anat/sub-sapienza04_T2star_rms_gmseg-manual.nii.gz
On aa3087640f297829c3f631588f37b7ef9e7a111e I see the same as Alex:
$ ls -l derivatives/labels/sub-amu02/anat/sub-amu02_T1w_RPI_r_labels-manual_new.nii.gz
-rwxr-xr-x 1 kousu kousu 557857 Aug 4 18:05 derivatives/labels/sub-amu02/anat/sub-amu02_T1w_RPI_r_labels-manual_new.nii.gz
on b00b64be99a1a632a93179719f2dee4911ec63c2 -- from master
-- I see the same as Alex:
$ git checkout b00b64be99a1a632a93179719f2dee4911ec63c2
HEAD is now at b00b64b Merge pull request #16 from PaulBautin/pb_corrections_t1w_t2w
$ ls -l derivatives/labels/sub-amu02/anat/sub-amu02_T1w_RPI_r_labels-manual.nii.gz
-rwxr-xr-x 1 kousu kousu 557857 Aug 4 19:24 derivatives/labels/sub-amu02/anat/sub-amu02_T1w_RPI_r_labels-manual.nii.gz
I'm not sure why your system is being strange @jcohenadad. How did you checkout @alexfoias' code locally? You should have done
git add remote alexfoias https://github.com/alexfoias/data-multi-subject.git
git fetch alexfoias
git checkout alexfoias/add_fsl_dataset
Is it possible you did something else?
Maybe you have something unusual in your git config -l
that explains it?
Are you working on your usual disk, or are you on a different disk that maybe has a different filesystem that is behaving differently?
Do we need to purge the previous images completely? Were they already defaced, just incorrectly, or were they not defaced at all? If the latter, we should remember they'll still be in the git log even after this PR goes in,
git rm
isn't enough to hide them.
@kousu i've configured the repos to only allow squash and merge (although you told me you changed this behaviour temporarily), which (i think you confirmed in the past) does get rid of committed elements within the branch, right?
I'm not sure why your system is being strange @jcohenadad. How did you checkout @alexfoias' code locally?
i did this (as instructed on this GH page under "Step 1: From your project repository, check out a new branch and test the changes."):
git checkout -b "alexfoias-add_fsl_dataset" "master"
git pull https://github.com/alexfoias/data-multi-subject.git "add_fsl_dataset"
I see them as 0o0755 like Alex does. I downloaded the repo (it's 21G on disk, ay yi yi) and looked at the permissions changed on this branch for that folder. It does not show any permissions changes:
I was basing myself on the change observed on GH, as described in https://github.com/spine-generic/data-multi-subject/pull/21#issuecomment-668060031. At the end, this is what matters, right? Maybe there are two competing effects, which are generating confusion: (1) different permissions on my end (for whatever reasons) and (2) actual changes (not permissions, but changes in the binaries), which explains the non-zero BIN on the github's diff in https://github.com/spine-generic/data-multi-subject/pull/21#issuecomment-668060031. If this is true, the latter is very problematic because files should be the same.
Do we need to purge the previous images completely? Were they already defaced, just incorrectly, or were they not defaced at all? If the latter, we should remember they'll still be in the git log even after this PR goes in,
git rm
isn't enough to hide them.@kousu i've configured the repos to only allow squash and merge (although you told me you changed this behaviour temporarily), which (i think you confirmed in the past) does get rid of committed elements within the branch, right?
Squash-merge is a special case of rebasing. It only applies to the code in the PR. If there are non-defaced images on master
then you need to do deeper surgery on the history to get rid of them.
But maybe I'm misunderstanding and blowing this out of proportion. Were non-defaced images ever uploaded? Or just incorrectly defaced images? Were any of the incorrectly defaced images effectively non-defaced?
If there are non-defaced images on master then you need to do deeper surgery on the history to get rid of them.
that's not the case, so all is good
EDIT 2020-08-05 12:35:47: "all is good" about the defaced images in the history-- but the issue about diff still remains https://github.com/spine-generic/data-multi-subject/pull/21#issuecomment-668894683
I see them as 0o0755 like Alex does. I downloaded the repo (it's 21G on disk, ay yi yi) and looked at the permissions changed on this branch for that folder. It does not show any permissions changes:
I was basing myself on the change observed on GH, as described in #21 (comment). At the end, this is what matters, right? Maybe there are two competing effects, which are generating confusion: (1) different permissions on my end (for whatever reasons) and (2) actual changes (not permissions, but changes in the binaries), which explains the non-zero BIN on the github's diff in #21 (comment). If this is true, the latter is very problematic because files should be the same.
I've figured it out. If you look a bit closer you can see this pattern: ever sidecar say "rename" and every image says "BIN +
But actually are renames. The log I quoted above shows renames.
commit 7a2e9d85c56d8473e98a88ce752712faf9d635f9
Author: Alexandru Foias <alexandrufoias@gmail.com>
Date: Fri Jul 31 15:10:09 2020 -0400
rename sapienza to fslAchieva
rename derivatives/labels/{sub-sapienza01/anat/sub-sapienza01_T1w_RPI_r_labels-manual.nii.gz => sub-fslAchieva01/anat/sub-fslAchieva01_T1w_RPI_r_labels-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza01/anat/sub-sapienza01_T2star_rms_gmseg-manual.nii.gz => sub-fslAchieva01/anat/sub-fslAchieva01_T2star_rms_gmseg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza01/anat/sub-sapienza01_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva01/anat/sub-fslAchieva01_acq-T1w_MTS_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza01/dwi/sub-sapienza01_dwi_moco_dwi_mean_seg-manual.json => sub-fslAchieva01/dwi/sub-fslAchieva01_dwi_moco_dwi_mean_seg-manual.json} (100%)
rename derivatives/labels/{sub-sapienza01/dwi/sub-sapienza01_dwi_moco_dwi_mean_seg-manual.nii.gz => sub-fslAchieva01/dwi/sub-fslAchieva01_dwi_moco_dwi_mean_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza03/anat/sub-sapienza03_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva03/anat/sub-fslAchieva03_acq-T1w_MTS_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza04/anat/sub-sapienza04_T2star_rms_gmseg-manual.nii.gz => sub-fslAchieva04/anat/sub-fslAchieva04_T2star_rms_gmseg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza04/anat/sub-sapienza04_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva04/anat/sub-fslAchieva04_acq-T1w_MTS_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza05/anat/sub-sapienza05_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva05/anat/sub-fslAchieva05_acq-T1w_MTS_seg-manual.nii.gz} (100%)
rename derivatives/labels/{sub-sapienza06/anat/sub-sapienza06_acq-T1w_MTS_seg-manual.nii.gz => sub-fslAchieva06/anat/sub-fslAchieva06_acq-T1w_MTS_seg-manual.nii.gz} (100%)
It's just a glitch in the Github UI.
You can verify the contents haven't changed by running sha256
on the files in both branches. Here, this should work on macOS I think:
git checkout master
sha256 derivatives/labels/sub-sapienza04/anat/sub-sapienza04_acq-T1w_MTS_seg-manual.nii.gz
git checkout alexfoias-add_fsl_dataset
sha256 sub-fslAchieva04/anat/sub-fslAchieva04_acq-T1w_MTS_seg-manual.nii.gz
or you could use this, which should be much quicker:
git show master:derivatives/labels/sub-sapienza04/anat/sub-sapienza04_acq-T1w_MTS_seg-manual.nii.gz | sha256
git show alexfoias-add_fsl_dataset:sub-fslAchieva04/anat/sub-fslAchieva04_acq-T1w_MTS_seg-manual.nii.gz | sha256
It might be helpful to know that git doesn't actually know about renames! Renames are not part of its data structure. Adds and deletions are. Every time it says "renamed" in the git log it's actually because it's detected an add/remove pair in the data structure with the same content on both sides.
So what Github is doing is not, technically, incorrect.
It might be helpful to know that git doesn't actually know about renames! Renames are not part of its data structure. Adds and deletions are. Every time it says "renamed" in the git log it's actually because it's detected an add/remove pair in the data structure with the same content on both sides.
So what Github is doing is not, technically, incorrect.
interesting. So, does it mean that the repository grows with renaming files?
what's funny is that some renamed file show no changes:
but again, that could be the glitch you mentioned above
or: is it possible that non-binary text files (like json) are inspected, and if no change is detected, then git does not make the repository "grow" for those files.
It might be helpful to know that git doesn't actually know about renames! Renames are not part of its data structure. Adds and deletions are. Every time it says "renamed" in the git log it's actually because it's detected an add/remove pair in the data structure with the same content on both sides. So what Github is doing is not, technically, incorrect.
interesting. So, does it mean that the repository grows with renaming files?
No, because the actual content is in ~/.git/objects and that stores content by hash -- the same way git-annex works, incidentally (gorey details here and here). So no, moving a file doesn't grow the repo.
what's funny is that some renamed file show no changes:
All the files that github shows as "renamed with no changes" are .json
. All the ones it shows with a size are .nii.gz
. I bet they're running diff
to decide how to generate that section which reacts differently to "text" and "binary" files. Notice that the .nii.gz
files have a little ->
arrow with the two filenames on either side in the header, so Github is recognizing it as a rename, it's just using a different display template.
@kousu is that OK if we merge this PR, or should we rather do it in https://github.com/kousu/spine-generic-data-multi-subject/tree/master-reannexed?
@alexfoias the PR title says "WIP: …" --> is it still WIP or ready to be reviewed?
@jcohenadad the PR is ready for review. I fixed the renaming for the test sub-amu02.
@jcohenadad the PR is ready for review. I fixed the renaming for the test sub-amu02.
~thx-- i've added this in the list of changes for this PR https://github.com/spine-generic/data-multi-subject/pull/21#issue-460426902~
^ scrap that. @alexfoias comment "I fixed the renaming for the test sub-amu02" was referring to the renaming test he has been conducting for the past commits, to check how file size is modified in github. So, this change is not part of this PR.
CHANGES:
Note: sapienza subjects mentioned in spine-generic pipeline have to be renamed to fslAchieva. It's the case in https://github.com/spine-generic/spine-generic/blob/1329c37e40569ab22c37526d2d3ac9eb2b8f23a6/spinegeneric/cli/generate_figure.py
Here is the qc of the complete dataset: qc_dataset.zip