bids-standard / bids-2-devel

Discussions and suggestions of backwards incompatible changes to BIDS
https://bids.neuroimaging.io/
Creative Commons Attribution 4.0 International

Extended BIDS for animals #38

Closed tsalo closed 2 months ago

tsalo commented 3 years ago

Could we have sub- changed to the species e.g. in http://datashare.is.ed.ac.uk/handle/10283/2122 we used rat instead of sub (rat- mice- cat- dog- there is more and more stuff out there now)

when using mutant we could either have it in the participant description or right away in the name, e.g. rat-SD-Bdnftm1sage_seq-SE_task_flashing-light_bold.nii.gz

Original authors: Unknown

tsalo commented 3 years ago

@TheChymera wrote:

Could we have sub- changed to the species e.g. in http://datashare.is.ed.ac.uk/handle/10283/2122 we used rat instead of sub (rat- mice- cat- dog- there is more and more stuff out there now)

I would advise against this. I work with animals myself, and each animal is a subject. Sub is perfectly fine.

tsalo commented 3 years ago

@CPernet wrote:

I would advise against this. I work with animals myself, and each animal is a subject. Sub is perfectly fine.

Yes and no: sub is perfectly fine indeed, but in a large database having the species in the name makes it easy to download / analyze based on the name.

tsalo commented 3 years ago

@jcolomb wrote:

Yes and no: sub is perfectly fine indeed, but in a large database having the species in the name makes it easy to download / analyze based on the name.

I would also say this information should not end up in the folder name; it would make tool development species-specific... A specific type of metadata would do the trick (and be backward-compatible (?)).
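A minimal sketch of the metadata approach suggested above, assuming the species is recorded in a `species` column of a tab-separated participants table while folder names stay `sub-<label>`. The table content and the helper name `subjects_of_species` are hypothetical and only for illustration:

```python
import csv
import io

# Hypothetical participants table; the values are made up for this sketch.
PARTICIPANTS_TSV = (
    "participant_id\tspecies\n"
    "sub-01\trattus norvegicus\n"
    "sub-02\tmus musculus\n"
    "sub-03\trattus norvegicus\n"
)


def subjects_of_species(tsv_text, species):
    """Return participant IDs whose species column matches (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    wanted = species.strip().lower()
    return [
        row["participant_id"]
        for row in reader
        if row["species"].strip().lower() == wanted
    ]


rats = subjects_of_species(PARTICIPANTS_TSV, "Rattus norvegicus")
print(rats)
```

Selecting by species then becomes a metadata query, so the download / analysis use case can be served without encoding the species in the file or folder name.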

tsalo commented 3 years ago

@TheChymera wrote:

when using mutant we could either have it in the participant description or right away in the name, e.g. rat-SD-Bdnftm1sage_seq-SE_task_flashing-light_bold.nii.gz

The difference between mutants and genotypes is merely nominal. Increasingly, double and triple transgenes are used in mouse studies. Soon enough multiple genotypes will be tracked in species without inbred strains (incl. humans). This addition simply doesn't scale. Not least of all, Bdnf isn't a very descriptive mutant name. Is it a knockout? Point mutation? Variant? And then, is that homozygous? Heterozygous? Ideally we should be wary of BIDS tracking too much info, lest it become encumbered. If necessary, I would propose tracking such info in a per-subject JSON.

tsalo commented 3 years ago

@CPernet wrote:

The difference between mutants and genotypes is merely nominal. Increasingly, double and triple transgenes are used in mouse studies. Soon enough multiple genotypes will be tracked in species without inbred strains (incl. humans). This addition simply doesn't scale. Not least of all, Bdnf isn't a very descriptive mutant name. Is it a knockout? Point mutation? Variant? And then, is that homozygous? Heterozygous? Ideally we should be wary of BIDS tracking too much info, lest it become encumbered. If necessary, I would propose tracking such info in a per-subject JSON.

Yes, agreed, we can move it into the JSON. Do you have references or a standard on how to reference animals? I know there is some work on using DOIs to identify them; it might be worth having this as well if possible.

TheChymera commented 3 years ago

Yes, agreed, we can move it into the JSON. Do you have references or a standard on how to reference animals? I know there is some work on using DOIs to identify them; it might be worth having this as well if possible.

This is very much an open question; we would need input from somebody who is more active at the cutting edge of genetics. Genome sequencing is not ubiquitous enough for us to be able to link the genome via a DOI. In animals for which we use inbred strains, a possible solution would be to have a string field for the strain, and a list-of-strings field for the tested genetic variants, e.g.

strain: 'C57BL/6'
gene_variants: ['DAT-IRES-Cre','ePet-Cre']

This is, of course, just a dummy example; in truth gene variants have highly complex naming schemes (see these links for the above two: https://www.jax.org/strain/006660 , https://www.jax.org/strain/012712 ), with plenty of superscript levels. We would first need to find out what the best standard for unambiguous identification of gene variants and insertion sites is, and just use that. Ideally this would also be a standard which doesn't use superscript; while I do assume we all use unicode by now, there's lots to be said for legibility and ease of manipulation in plain text.
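As a rough sketch of what such a per-subject JSON could look like, using the same dummy `strain` and `gene_variants` fields as above. The helper `subject_metadata` is hypothetical, and the ASCII-only check is an assumption that merely mirrors the stated preference for plain text; it is not part of any standard:

```python
import json


def subject_metadata(strain, gene_variants):
    """Build a per-subject metadata record with a strain and a list of variants."""
    if isinstance(gene_variants, str):
        raise TypeError("gene_variants must be a list of strings, not a single string")
    variants = list(gene_variants)
    for value in [strain, *variants]:
        # Assumption for this sketch: keep identifiers as non-empty plain ASCII text.
        if not isinstance(value, str) or not value or not value.isascii():
            raise ValueError(f"expected a non-empty ASCII string, got {value!r}")
    return {"strain": strain, "gene_variants": variants}


record = subject_metadata("C57BL/6", ["DAT-IRES-Cre", "ePet-Cre"])
sidecar_text = json.dumps(record, indent=2)
print(sidecar_text)
```

A list field leaves room for double and triple transgenics without changing the filename scheme.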

jcolomb commented 3 years ago

Please, have a look at reagents.io .

In brief: for reagents, the best PIDs are RRIDs. Some have tried to use them for animal strains, but specialists advise against it and would rather see a system that uses the PIDs of the databases (species-specific databases for now, but see the ENCODE project: https://www.encodeproject.org)

For rodents, there are MGI numbers. I do not know if one can get an MGI number for a strain before publication, or if only published strains/genetic alterations get one. For example, for the first example above (https://www.jax.org/strain/006660), the correct PID would be http://www.informatics.jax.org/allele/genoview/MGI:3689567. (The first link goes to the specific strain available at jax; the second could link to several mouse providers.)
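For illustration only, a small sketch that turns an MGI number into the kind of genoview link shown above. The URL prefix is copied from that example; the `MGI:<digits>` pattern is an assumption inferred from it, and the check is purely syntactic (it does not confirm that the identifier exists in the database):

```python
import re

# Assumed shape of an MGI identifier, inferred from the single example above.
MGI_PATTERN = re.compile(r"MGI:\d+")
# URL prefix taken from the example link in this thread.
GENOVIEW_BASE = "http://www.informatics.jax.org/allele/genoview/"


def genoview_url(mgi_id):
    """Return the genoview link for a syntactically valid MGI identifier."""
    if not MGI_PATTERN.fullmatch(mgi_id):
        raise ValueError(f"not an MGI identifier: {mgi_id!r}")
    return GENOVIEW_BASE + mgi_id


print(genoview_url("MGI:3689567"))
```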

It is quite complex to find, and few researchers would get it right...

SylvainTakerkart commented 3 years ago

Hi all, thanks @jcolomb for mentioning the neuroscience-data-structure project

We have indeed launched a dedicated focus group (which will soon appear as a Special Interest Group of the INCF) to try to standardize the structuring of data and metadata recorded in animal models, with any modality (from electrophysiology to calcium imaging, via MRI)... Several ongoing initiatives are trying to address this question, and we'd like to see how to accommodate them to reach wide adoption by the community. One of the questions that we aim to address is of course "would it make sense to do this within BIDS, or are there enough specifics with animal data that force us to go outside of BIDS", and we of course would like to address this question with as much feedback as possible from the community...

Anyhow, if you are interested in this endeavor, do not hesitate to get in touch with me and/or to participate in the discussions that we started in the Issues of this project: https://github.com/INCF/neuroscience-data-structure

Cheers,

Sylvain

SylvainTakerkart commented 3 years ago

@TheChymera wrote:

Could we have sub- changed to the species e.g. in http://datashare.is.ed.ac.uk/handle/10283/2122 we used rat instead of sub (rat- mice- cat- dog- there is more and more stuff out there now)

I would advise against this. I work with animals myself, and each animal is a subject. Sub is perfectly fine.

In parallel with the wider neuroscience-data-structure project, we've tried to come up with some internal rules within our 150-person institute (see our current state of affairs here: https://int-nit.github.io/AnDOChecker/ ; it's not BIDS-compatible, but not too far... comments welcome!!! ;) )... We chose to stick with sub-XXX, and we're pushing towards XXX being some kind of GUID; for animals, our GUID remains human-readable, and we have one letter dedicated to specifying the species of the animal in our GUID...

yarikoptic commented 2 months ago

To me, this proposal goes against the common BIDS principle of entity-{value} composition in filenames etc., by taking the value of a potential entity (species) and making it into an "entity". Also, demand seems to be simply not there, so I would simply close this one. If someone wants to resurrect it, please chime in, or do so.
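The entity-{value} point can be sketched with a deliberately simplified parser. It assumes a filename is a run of underscore-separated `key-value` chunks followed by a suffix, with alphanumeric keys and values; `parse_entities` is a hypothetical helper, not part of any BIDS tool:

```python
def parse_entities(filename):
    """Split a filename into (entities, suffix, malformed chunks)."""
    stem = filename.split(".", 1)[0]  # drop extensions such as .nii.gz
    *chunks, suffix = stem.split("_")
    entities = {}
    malformed = []
    for chunk in chunks:
        key, sep, value = chunk.partition("-")
        # A well-formed chunk is exactly one alphanumeric key and one alphanumeric value.
        if sep and key.isalnum() and value.isalnum():
            entities[key] = value
        else:
            malformed.append(chunk)
    return entities, suffix, malformed


conventional = parse_entities("sub-01_task-rest_bold.nii.gz")
proposed = parse_entities("rat-SD-Bdnftm1sage_seq-SE_task_flashing-light_bold.nii.gz")
print(conventional)
print(proposed)
```

With the conventional name, a tool can look up the fixed key `sub`. With the proposed name, no `sub` key is found, so a tool would need a list of every possible species prefix to locate the subject.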

TheChymera commented 2 months ago

Not resurrecting, just updating, since the original rat-{value} proposal was a quote from me. I have long since moved to sub-{value}. This is no longer a current proposal on my part.

yarikoptic commented 2 months ago

Do you mean we should put you in place of the

Original authors: Unknown

in the description of this issue?

TheChymera commented 2 months ago

Not sure if I'm the original originator. Just wanted to clarify that there's nobody remaining in support of this now, since the only quote in its support was from me → https://github.com/bids-standard/bids-2-devel/issues/38#issuecomment-668047141

CPernet commented 2 months ago

The authors of the dataset are mentioned. At the time, in 2016 when I was in Edinburgh, I helped/pointed Xenios Milidonis to BIDS; he made that dataset.