bids-standard / bids-specification

Brain Imaging Data Structure (BIDS) Specification
https://bids-specification.readthedocs.io/
Creative Commons Attribution 4.0 International

HowToAcknowledge not explicit in data type. #372

Closed rwblair closed 4 years ago

rwblair commented 4 years ago

Text for the field in dataset_description.json in the current spec:

"OPTIONAL. Instructions how researchers using this dataset should acknowledge the original authors. This field can also be used to define a publication that should be cited in publications that use the dataset."

In bids-standard/bids-validator#772 we settled on the validator interpreting this field as a string. This breaks an example we have in bids-examples: https://github.com/bids-standard/bids-examples/blob/master/ds000248/dataset_description.json#L10

Should a data type be added for this field? If so, should it be a string, an array of strings, either a string or an array of strings, or some other type?

From discussion with @effigies.

effigies commented 4 years ago

@bids-standard/raw

sappelhoff commented 4 years ago

The example you link to was prepared by me: I use an array of strings ... each entry in the array is one source to acknowledge / information on how to acknowledge.

I see that this is not very machine readable and that a simple string would be more straightforward. A string would also be more in line with what is written in the specification.

effigies commented 4 years ago

Well, I don't hear a great clamoring for preserving lists, and if it's just the one example, clarifying that it should be a string makes sense to me.

At least three OpenNeuro datasets will be affected:

$ grep -rI '"HowToAcknowledge":\[' ds*/dataset_description.json
ds000222/dataset_description.json:{"Name":"Sequential Inference VBM","BIDSVersion":"1.0.0rc1","License":"PDDL","Authors":["Thomas H B FitzGerald","Dorothea Haemmerer","Karl J Friston","Shu-Chen Li","Raymond J Dolan"],"HowToAcknowledge":["Citation of FitzGerald et al. Sequential inference as a mode of cognition and its correlates in fronto-parietal and hippocampal brain regions. PLoS Computational Biology (2017)","This data was obtained from the OpenfMRI database. Its accession number is ds000222"]}
ds000235/dataset_description.json:{"Name":"Whole-brain background-suppressed pCASL MRI with 1D-accelerated 3D RARE Stack-Of-Spirals Readout- Dataset 2","BIDSVersion":"1.0.0rc1","License":"PD","Authors":["Marta Vidorreta","Ze Wang","Yulin V. Chang","Maria A. Fernandez-Seara","John A. Detre"],"Acknowledgements":"Tiejun Zhao","HowToAcknowledge":["We  are  grateful  to  Tiejun  Zhao  from  Siemens  Healthcare  for  his  valuable  help  in  the  implementation of the online reconstruction algorithm of the sequence for the Siemens platform.","This data was obtained from the OpenfMRI database. Its accession number is ds000235."],"Funding":"NIH grants no. P41EB015893 and MH080729, and NSFC grant no. 81471644, and Hangzhou Innovation Seed Fund"}
ds000236/dataset_description.json:{"Name":"Whole-brain background-suppressed pCASL MRI with 1D-accelerated 3D RARE Stack-Of-Spirals Readout- Dataset 3","BIDSVersion":"1.0.0rc1","License":"PD","Authors":["Marta Vidorreta","Ze Wang","Yulin V. Chang","Maria A. Fernandez-Seara","John A. Detre"],"Acknowledgements":"Tiejun Zhao","HowToAcknowledge":["We  are  grateful  to  Tiejun  Zhao  from  Siemens  Healthcare  for  his  valuable  help  in  the  implementation of the online reconstruction algorithm of the sequence for the Siemens platform.","This data was obtained from the OpenfMRI database. Its accession number is ds000236."],"Funding":"NIH grants no. P41EB015893 and MH080729, and NSFC grant no. 81471644, and Hangzhou Innovation Seed Fund"}

Each of these came from OpenfMRI, so we can go in and manually fix them up.
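
For illustration, the manual fix-up could look roughly like the sketch below. It assumes that joining the array entries with a single space is an acceptable way to collapse them into one string; that convention and the helper name `normalize_how_to_acknowledge` are hypothetical and were not decided in this thread.

```python
import json


def normalize_how_to_acknowledge(description: dict) -> dict:
    """Return a copy of a dataset_description dict in which
    HowToAcknowledge is a single string.

    Assumption (not from the spec): array entries are joined with
    one space. Any other value is left untouched.
    """
    fixed = dict(description)
    value = fixed.get("HowToAcknowledge")
    if isinstance(value, list):
        fixed["HowToAcknowledge"] = " ".join(str(item).strip() for item in value)
    return fixed


# Small made-up example, not one of the datasets listed above.
example = json.loads(
    '{"Name": "demo", "HowToAcknowledge": ["Cite paper A.", "Mention the accession number."]}'
)
print(json.dumps(normalize_how_to_acknowledge(example), indent=2))
```

The function returns a copy rather than editing in place, so the original file contents can still be compared against the result before anything is written back.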

effigies commented 4 years ago

Well, here's a case of us hitting this in the wild: https://neurostars.org/t/problem-uploading-dataset-to-openneuro-not-a-valid-bids-dataset-error/5532

emdupre commented 4 years ago

My 2c would be to enforce HowToAcknowledge as a string. It's intended to be added to human-centered documentation (manuscripts, etc.), where having more than one sentence (as in all the linked cases) shouldn't be a hindrance, I'd think!

teonbrooks commented 4 years ago

I agree with @emdupre that we should enforce it as a string

effigies commented 4 years ago

Sounds like consensus. I'll update those datasets on OpenNeuro. Would somebody like to propose some clarifying language for the spec?

sappelhoff commented 4 years ago

Would somebody like to propose some clarifying language for the spec?

- OPTIONAL. Instructions how researchers using this  dataset should
- acknowledge the original authors. This field can also be used to define
- a publication that should be cited in publications that use the dataset.
+ OPTIONAL. A string of text containing
+ instructions how researchers using this  dataset should
+ acknowledge the original authors. This field can also be used to define
+ a publication that should be cited in publications that use the dataset.

effigies commented 4 years ago

I was looking for other examples, and at least in common derivatives, we state a type with:

REQUIRED. Boolean.

Would it work for people to specify types with:

OPTIONAL. String.

Or maybe:

OPTIONAL. `string`.

This discussion ties into @yarikoptic's suggestion in #350, btw.

sappelhoff commented 4 years ago

I would be +1 to specify the expected variable type like I commented before:

it'd be good to consistently advertise the expected data type in the spec and validate it.

VisLab commented 4 years ago

On a related note: has there been any consideration of an optional field in the description for IRB information, or is this something that goes into HowToAcknowledge? Most journals require authors to provide IRB information for any data that they use in a publication.


sappelhoff commented 4 years ago

We are now quickly running from one issue into three issues with this thread. Let's divide and conquer.

As for this thread: it was about whether we want to accept STRING only for the HowToAcknowledge field ... and make this clear in both validator and spec.

So far, everybody has agreed that STRING would be good. I suggest we go ahead with this, make the adjustments, and close this issue (continuing the open points in their own issues as linked above).
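
As a rough illustration of the agreed rule, a type check could be sketched as below. This is not the bids-validator's actual implementation; the function name `check_how_to_acknowledge` and the error wording are hypothetical.

```python
def check_how_to_acknowledge(description: dict) -> list:
    """Return a list of error messages for the HowToAcknowledge field.

    The field is OPTIONAL, so a missing key is fine. If present,
    the value must be a string (hypothetical check for illustration).
    """
    errors = []
    if "HowToAcknowledge" in description:
        value = description["HowToAcknowledge"]
        if not isinstance(value, str):
            errors.append(
                "HowToAcknowledge must be a string, got "
                + type(value).__name__
            )
    return errors


# A string passes, an array of strings is reported.
print(check_how_to_acknowledge({"HowToAcknowledge": "Please cite our paper."}))
print(check_how_to_acknowledge({"HowToAcknowledge": ["Please cite our paper."]}))
```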

effigies commented 4 years ago

Sounds good to me.