spine-generic / data-multi-subject

Multi-subject data for the Spine Generic project
Creative Commons Attribution 4.0 International
22 stars 15 forks source link

Adding height and weight into participants.tsv and participants.json #57

Closed renelabounek closed 3 years ago

renelabounek commented 3 years ago

If other users will delete the hidden character separating columns during height/weight value filling, I don't believe most people will repair the table. I have not found what the character is separating the columns. I made it work only with CTRL+C CTRL+V of 2 columns with space selection before and after string, then I have rewritten the strings. The hidden character is not space or Tab. I did not find what is the character and how to normally write it.

jcohenadad commented 3 years ago

Thank you for the contribution @renelabounek. We've discussed with @alexfoias and he will implement an additional test in our CI to make sure there is no error introduced in participants.tsv (e.g. it is easy to miss a space/tab, which would introduce inconsistencies in the table).

Once the test is introduced, it will be merged so this PR will be properly tested.

renelabounek commented 3 years ago

"This branch is out-of-date with the base branch Merge the latest changes from master into this branch."

Do I need to make some changes/updates to get it into the spine-generic:master version? Or just click "Update branch"? Or do nothing?

jcohenadad commented 3 years ago

Do I need to make some changes/updates to get it into the spine-generic:master version? Or just click "Update branch"? Or do nothing?

Please do not click on "update branch" as this repos is synced between github and git-annex (via S3 Amazon), so the proper way to do it is via a rebase, done via the Terminal. I'll take care of it.

kousu commented 3 years ago

Those two files aren't annexed -- only the NIfTi files are -- so regular git rules apply. @renelabounek can do either git rebase master && git push -f or git merge master && git push to integrate the two commits from the latest trunk. I think git rebase master is tidier but it wouldn't would do any great harm to click the Update button (which is the same as running git merge master).

git-annex is touchy around PRs since it was never designed to work in a fully distributed fashion, but this PR doesn't run into that issue!

jcohenadad commented 3 years ago

Thank you for the clarifications @kousu. I thought it would be a good habit to never push that button for this repository (in case in the future a PR does include a nifti file), so we have a strict (and simple) procedure to follow, to make sure nobody screws up. But I also understand that we could take it on a case-by-case basis.

About the tidiness and git merge: i assume we don't want to allow "squash and merge" for PRs in this repository? (in case one PR has nifti files, that could create undesired conflicts with the history of git-annex?)

kousu commented 3 years ago

That's a solid bug you found @renelabounek with the spaces instead of tabs. Thanks so much for catching it.

I noticed that your fix for introduced Windows line-endings to the file. You can't see it in Notepad or in https://github.com/spine-generic/data-multi-subject/pull/57/files but you can if you use the command line:

commit e04729dd82f6228bbe8e7c8bbfe43dd50978269c (HEAD -> master)
Author: Rene Labounek <rene.labounek@gmail.com>
Date:   Wed Sep 30 15:36:51 2020 -0500

    Update participants.tsv

    cmrra height and weight added

diff --git a/participants.tsv b/participants.tsv
index 621e5a86..11c3d575 100755
--- a/participants.tsv
+++ b/participants.tsv
@@ -49,12 +49,12 @@ sub-cardiff03       M       33      -       -       2019-03-05      cardiff "CUBRIC, Cardiff University, Wales, UK
 sub-cardiff04  F       31      -       -       2019-03-12      cardiff "CUBRIC, Cardiff University, Wales, UK" Siemens Prisma  HeadNeck_64     VE11C   "G. Tackley, S. Kusmia, R. Wise"
 sub-cardiff05  F       30      -       -       2019-03-14      cardiff "CUBRIC, Cardiff University, Wales, UK" Siemens Prisma  HeadNeck_64     VE11C   "G. Tackley, S. Kusmia, R. Wise"
 sub-cardiff06  M       40      -       -       2019-03-20      cardiff "CUBRIC, Cardiff University, Wales, UK" Siemens Prisma  HeadNeck_64     VE11C   "G. Tackley, S. Kusmia, R. Wise"
-sub-cmrra01    M       24      -       -       2019-02-27      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"
-sub-cmrra02    M       27      -       -       2019-03-01      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"
-sub-cmrra03    F       28      -       -       2019-03-08      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"
-sub-cmrra04    F       22      -       -       2019-03-18      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"
-sub-cmrra05    F       24      -       -       2019-03-28      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"
-sub-cmrra06    M       22      -       -       2019-04-02      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"
+sub-cmrra01    M       24      185     68      2019-02-27      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"^M
+sub-cmrra02    M       27      188     91      2019-03-01      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"^M
+sub-cmrra03    F       28      160     64      2019-03-08      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"^M
+sub-cmrra04    F       22      163     54      2019-03-18      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"^M
+sub-cmrra05    F       24      152     47      2019-03-28      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"^M
+sub-cmrra06    M       22      178     77      2019-04-02      cmrra   University of Minnesota Siemens Prisma-fit      HeadNeck_64     syngo_MR_E11    "R. Labounek, J. Joers, C. Nguyen, I. Nestrasil, C. Lenglet, P-G. Henry"^M

The ^M at the end is part of the Windows-style line-ending.

Since this project is used by all three major OSes, it would be good to keep the line endings unix-style, which is the most cross-compatible, especially for data-sciency projects.

Do you have Notepad++ or VSCode or something? They should have a "Convert to Unix Line Endings" option somewhere.


You can help be compatible with unix projects in the future by setting:

git config --global core.autocrlf true

on your system; that'll mean everything is recorded unix-style but is available to you Windows-style when you're working on Windows.

renelabounek commented 3 years ago

@kousu I am using unix as the default system where I normally work. Not sure how my PC would show Windows hidden character ^M. Still, I did all my edits through the github.com web interface. Here I was not able to see the ^M character in the online editor. As I have fixed it only via CTRl+C CTRL+V of the table's existing text through the github online editor (I believe many users can select this choice). I used both spaces or Tabs and nothing worked (can track in previous submitted not working versions).

About Windows users, I think <1% users are using NotePad++ but less more would use the SCT. So it is hard to estimate how many Windows SCT users are using the NotePad++ and if the editor can help to avois this bug. It is quite sure that the author of the original participants.tsv file is the Windows user as the file uses ^M character for column separation.

kousu commented 3 years ago

Ah sorry, I fell into a Github trap. I don't understand why sometimes rebasing branches makes it close (and not allow to reopen) their PRs.

edit it's because I didn't understand my own syntax. I did

$ git push -f -u r master
Enter passphrase for key '/home/kousu/.ssh/id_rsa.github': 
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
To github.com:renelabounek/data-multi-subject.git
 + 1c48531b4...46df26243 master -> master (forced update)
Branch 'master' set up to track remote branch 'master' from 'r'.

But I should have done

$ git push -f -u r HEAD:master

What I did reset @renelabounek's branch to match ours, and now I can't undo it because, with the PR closed, I no longer have the rights to edit their fork. Ah well.

I've reopened this at #85.

I'm also going to write a .tsv verifier.

kousu commented 3 years ago

And after examining the commits, @renelabounek, I think the ^Ms were introduced by Alex's workstation.