neuropoly / data-management

Repo that deals with datalad aspects for internal use

`sct-testing-large`: extraneous `"Metadata"` sidecar field #153

Open kousu opened 2 years ago

kousu commented 2 years ago

The JSON sidecars in sct-testing-large have an extraneous "Metadata" field. BIDS does not specify it, and I don't think it makes much sense, because everything in a sidecar is metadata.

It also hasn't been used that consistently. Here's a count of the values it takes on:

u108545@joplin:~/sct-testing-large$ find . -name "*.json" | xargs jq -c '.Metadata' | sort | uniq -c | sort -nr
parse error: Expected separator between values at line 46882, column 12
   2318 null
   1912 {"GmModel":false,"MsMapping":false,"Pam50":false}
     34 {"GmModel":true,"MsMapping":false,"Pam50":true}
     30 {"GmModel":true,"MsMapping":false,"Pam50":false}
     29 {"GmModel":false,"MsMapping":true,"Pam50":false}
     26 {"GmModel":false,"MsMapping":false,"Pam50":true}
      3 {"added_by":"Alexandru Foias","added_on":"2021-05-27","contact":"Maryam Seif","URL":"email from 2021-05-26"}
      3 {"added_by":"Alexandru Foias","added_on":"2020-12-14","contact":"tschri7@vt.edu"}
      3 {"added_by":"Alexandru Foias","added_on":"2020-04-01","contact":"Baker, Sarah E","GmModel":false,"MsMapping":false,"Pam50":false}
      2 {"added_by":"Alexandru Foias","added_on":"2021-03-08","contact":"barryrl","URL":"https://forum.spinalcordmri.org/t/best-practices-for-manually-defining-vertebral-levels/645/3"}
      2 {"added_by":"Alexandru Foias","added_on":"2020-05-04","contact":"Mihael Varosanec"}
      2 {"added_by":"Alexandru Foias","added_on":"2020-04-15","contact":"Mihael Varosanec","GmModel":false,"MsMapping":false,"Pam50":false}
      2 {"added_by":"Alexandru Foias","added_on":"2020-03-23","contact":"Baker, Sarah E","GmModel":false,"MsMapping":false,"Pam50":false}
      1 {"added_by":"Julien Cohen-Adad","contact":"Rob Barry","GmModel":false,"MsMapping":false,"Pam50":false}
      1 {"added_by":"Julien Cohen-Adad","contact":"http://forum.spinalcordmri.org/t/weird-results-using-propseg/340","GmModel":false,"MsMapping":false,"Pam50":false}
      1 {"added_by":"Julien Cohen-Adad","added_on":"2020-03-15","contact":"Baker, Sarah E","GmModel":false,"MsMapping":false,"Pam50":false}
      1 {"added_by":"Alexandru Foias","added_on":"2020-12-09","contact":"ctchou"}
      1 {"added_by":"Alexandru Foias","added_on":"2020-05-20","contact":"Achopra"}
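
The `parse error` at the top of that output means at least one sidecar is malformed, but the pipeline does not say which one. A minimal Python sketch of the same tally that also reports unreadable files is below. The function name `tally_metadata` is hypothetical, and it assumes every `*.json` under the dataset root is a sidecar. Unlike `jq -c`, it sorts keys before counting, so objects that differ only in key order collapse into one row.

```python
import json
from collections import Counter
from pathlib import Path


def tally_metadata(root):
    """Count distinct "Metadata" values under root.

    Returns (counts, broken): a Counter keyed by the JSON-serialized value,
    and a list of paths that could not be read or parsed.
    """
    counts = Counter()
    broken = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            sidecar = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # OSError: unreadable file, e.g. a dangling symlink.
            # ValueError: malformed JSON or undecodable bytes.
            broken.append(path)
            continue
        # A missing field is counted as null, matching jq's behaviour.
        value = sidecar.get("Metadata") if isinstance(sidecar, dict) else None
        # Sorted keys make equal objects serialize identically.
        counts[json.dumps(value, sort_keys=True)] += 1
    return counts, broken
```

Calling `tally_metadata(".")` from the dataset root and printing `counts.most_common()` gives a table comparable to the one above, and `broken` lists the files that need fixing by hand.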

Some uses are for provenance tracking, which I think (see https://github.com/neuropoly/data-management/issues/108#issuecomment-980462401) should be stored in git itself.

I'm guessing the "GmModel", "MsMapping" and "Pam50" keys have something to do with tests for SCT. But if so, shouldn't that information be stored in SCT's tests? We could put the paths to these files somewhere under https://github.com/spinalcordtoolbox/spinalcordtoolbox/tree/master/testing.

kousu commented 2 years ago

At the very least, the field should be unnested, so that its keys sit at the top level of the sidecar.
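
A rough sketch of what unnesting could look like, assuming the keys are kept at all. The function name `unnest_metadata` is hypothetical, and refusing to overwrite a conflicting top-level key is my own assumption about the safest behaviour, not something decided in this thread.

```python
def unnest_metadata(sidecar):
    """Return a copy of sidecar with the keys of "Metadata" moved to the top level."""
    # Copy everything except the nested field itself.
    result = {k: v for k, v in sidecar.items() if k != "Metadata"}
    nested = sidecar.get("Metadata")
    if nested is None:
        # Field missing or explicitly null: nothing to move.
        return result
    if not isinstance(nested, dict):
        raise ValueError('"Metadata" is not an object')
    for key, value in nested.items():
        # Assumption: never silently overwrite an existing top-level key.
        if key in result and result[key] != value:
            raise ValueError(f"conflicting top-level key: {key}")
        result[key] = value
    return result
```

For example, `{"EchoTime": 0.01, "Metadata": {"Pam50": True}}` would become `{"EchoTime": 0.01, "Pam50": True}`; applying this to the dataset would mean loading each sidecar with `json.load`, passing it through the function, and writing it back.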