tsalo commented 3 years ago

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/bids-discussion/SwH-1KRnBU0/oCx0ynEpBAAJ

Our group would also be interested in finding a way to distinguish between sites for the datasets we work with. I personally would prefer to use the key 'site-' rather than 'centre-', as 'centre' is not a shared spelling between British and American English.

Maybe overall, all the participants from a single site could also live within a directory for that site. So the directory structure might be amended to look like:

/site-<site_label>/sub-<participant_label>/[ses-<session_label>/]
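As a rough illustration of the layout proposed above, here is a minimal Python sketch that assembles the site-first directory path. The function name `site_first_dir` and the labels "A", "01" and "02" are made up for the example; nothing here is part of the BIDS specification.

```python
from pathlib import PurePosixPath
from typing import Optional


def site_first_dir(site: str, participant: str,
                   session: Optional[str] = None) -> PurePosixPath:
    """Build site-<site_label>/sub-<participant_label>/[ses-<session_label>/].

    Hypothetical helper: the site- directory level is only a proposal
    from this thread, not an existing BIDS rule.
    """
    parts = [f"site-{site}", f"sub-{participant}"]
    if session is not None:
        # The session level is optional, as in the bracketed template.
        parts.append(f"ses-{session}")
    return PurePosixPath(*parts)


print(site_first_dir("A", "01"))        # site-A/sub-01
print(site_first_dir("A", "01", "02"))  # site-A/sub-01/ses-02
```

The paths are relative to the dataset root, so the leading slash of the template is left out.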

Original authors: @jpellman

tsalo commented 3 years ago

@thomasbeaudry wrote:

why not add the site to the JSON file?

tsalo commented 3 years ago

@Athanasiamo wrote:

why not add the site to the JSON file?

I have multi-site data and found Chris' advice to very much fit our data and the way our researchers use it. Though, we are not a "true" multi-site study, I believe this is a very nice solution (though produces long file names): https://groups.google.com/forum/#!topic/bids-discussion/WV-weqTusNQ

tsalo commented 3 years ago

@alexandreroutier wrote:

why not add the site to the JSON file?

I currently have to work with a multi-site study and, in my experience, the simplest and quite efficient solution was to embed the site into the subject ID, e.g. sub-01CLNC001, where "01" is the site id and "CLNC001" the participant label. I am quite satisfied with that.

However, the only drawback I see in this approach is that it assumes that the participant does not change site during the study. In that case, we could consider embedding the site in the session label instead.

tsalo commented 3 years ago

@chrisgorgo wrote:

This is also one of the recommendations from the main spec. See https://docs.google.com/document/d/1HFUkAEE-pB-angVcYe6pf_-fVf4sCpOHKesUvfb8Grc/edit#heading=h.29tn5cduh4ci

tsalo commented 3 years ago

@TKoscik wrote:

from my point of view, site is a critical piece of information that generally denotes batch effects within a particular project. For example, in the ABCD protocol there will be batch effects between sites and scanners and scanner software revisions, etc.

Also, a site variable varies independently from subject and session, so lumping this information with other variables seems incorrect.

Having a separate folder for each site also seems to place this variable at the wrong level of analysis, as it is a within-project, and potentially within-subjects, variable. Not to mention that changing the folder structure will impact scripts more than a filename change.

Currently we are using the site tag to denote a combination of site, vendor, and relevant scanner/software changes. The benefit is that this makes the need to explore/correct for batch effects immediately visible. Currently we are using a 5-digit code:

first 2 digits, identify institute, e.g., UIHC = 00
3rd digit identifies the scanner vendor, e.g., 1=Siemens, 2=GE, 3=Philips
last two digits, identify scanner and software version (and other major changes on each scanner that might cause batch effects in images), e.g., initial scanner setup for a given scanner is 00, every relevant change increments this number.
e.g., filename = sub-123_ses-12345sixsite-00201*
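To make the 5-digit scheme above concrete, here is a minimal Python sketch that splits such a code into its three parts. The vendor digits come from the list above; the names `SiteCode`, `VENDORS` and `parse_site_code` are hypothetical, and treating the last two digits as an integer revision counter is an assumption based on "every relevant change increments this number".

```python
from typing import NamedTuple

# Vendor digit mapping as described in the comment above.
VENDORS = {"1": "Siemens", "2": "GE", "3": "Philips"}


class SiteCode(NamedTuple):
    institute: str  # first two digits, e.g. "00" for UIHC
    vendor: str     # decoded from the third digit
    revision: int   # last two digits: scanner/software revision counter


def parse_site_code(code: str) -> SiteCode:
    """Split a 5-digit site code into institute, vendor and revision."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"expected a 5-digit code, got {code!r}")
    vendor = VENDORS.get(code[2])
    if vendor is None:
        raise ValueError(f"unknown vendor digit {code[2]!r} in {code!r}")
    return SiteCode(institute=code[:2], vendor=vendor, revision=int(code[3:]))


print(parse_site_code("00201"))
# SiteCode(institute='00', vendor='GE', revision=1)
```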

tsalo commented 3 years ago

@HenkMutsaerts wrote:

I personally would prefer to use the key 'site-' rather than 'centre-', as 'centre' is not a shared spelling between British and American English.

I agree that this is a nice contribution to BIDS. However, in a multi-site study, 1 site can still have multiple scanners, and in most image processing you want to correct per scanner. So instead of 'site', you could consider 'scanner'

tsalo commented 3 years ago

@pvdemael wrote:

I agree that this is a nice contribution to BIDS. However, in a multi-site study, 1 site can still have multiple scanners, and in most image processing you want to correct per scanner. So instead of 'site', you could consider 'scanner'

This can easily be added to the participants file by adding columns for sites and scanners. IMHO the scanner is very important information, but it should not be in the filename, which is more oriented towards features of the image.
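A minimal sketch of what that could look like, assuming two extra columns named `site` and `scanner` in participants.tsv. Those column names and all values below are hypothetical; only `participant_id` is the standard column. The snippet reads the table and groups participants per site/scanner pair, which is the grouping one would correct for.

```python
import csv
import io

# Hypothetical participants.tsv content; "site" and "scanner" are
# illustrative column names, not columns defined by the specification.
PARTICIPANTS_TSV = (
    "participant_id\tsite\tscanner\n"
    "sub-01\tsiteA\tscanner1\n"
    "sub-02\tsiteA\tscanner2\n"
    "sub-03\tsiteB\tscanner1\n"
)


def scanner_groups(tsv_text: str) -> dict:
    """Map each (site, scanner) pair to the list of participant IDs."""
    groups = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        key = (row["site"], row["scanner"])
        groups.setdefault(key, []).append(row["participant_id"])
    return groups


for key, subjects in scanner_groups(PARTICIPANTS_TSV).items():
    print(key, subjects)
```

Note that one row per participant only works if a participant never changes site or scanner; otherwise the information would have to live at the session level.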

jbteves commented 3 years ago

My view is that the subject ID should generally encapsulate the site anyway, as discussed above. Additionally, the scanner could be embedded as part of the metadata.

drmowinckels commented 3 years ago

Hi.

We have ended up using something similar to what @tsalo describes. We ended up incorporating site into the session tags, because we have subjects that have participated at multiple sites over time.

For us now, the numerals in the session tag count the sequential scans, while the letters indicate the site, e.g. sub-1200131_ses-01siteScanner. The site information is a combination of site acronym and scanner.

We found this solution to be the best also because of our longitudinal data. For our specific data, we have decided to change the meaning of "session" from standard MRI terms, so that the numerals in the session mean "longitudinal timepoint" rather than scan session. This was necessary to make a system that made it easy for our staff to recognise what cognitive data fits with which scan data. This was extra important also because some of our participants have, within the same longitudinal timepoint, visited several scanners in a single day, so we could try to estimate the error in measurements when participants switch from one scanner to another due to upgrades.

So we could have

sub-1200131_ses-01scannerOne
sub-1200131_ses-02scannerOne
sub-1200131_ses-02scannerTwo
sub-1200131_ses-03scannerTwo
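As an illustration, here is a minimal Python sketch that splits names like the ones above into subject, timepoint and scanner, then lists which scanners were visited at each timepoint. The regular expression and the helper name `split_session` are assumptions for this example; it expects the session label to be digits followed by letters only, so a scanner label containing digits would need a different pattern.

```python
import re
from collections import defaultdict

# Assumed pattern: numerals = longitudinal timepoint, letters = site/scanner.
SESSION_RE = re.compile(
    r"^sub-(?P<sub>[0-9A-Za-z]+)_ses-(?P<timepoint>\d+)(?P<scanner>[A-Za-z]+)$"
)


def split_session(name: str):
    """Return (subject label, timepoint as int, scanner label)."""
    match = SESSION_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected name: {name!r}")
    return match["sub"], int(match["timepoint"]), match["scanner"]


names = [
    "sub-1200131_ses-01scannerOne",
    "sub-1200131_ses-02scannerOne",
    "sub-1200131_ses-02scannerTwo",
    "sub-1200131_ses-03scannerTwo",
]

scanners_by_timepoint = defaultdict(list)
for name in names:
    _, timepoint, scanner = split_session(name)
    scanners_by_timepoint[timepoint].append(scanner)

print(dict(scanners_by_timepoint))
# {1: ['scannerOne'], 2: ['scannerOne', 'scannerTwo'], 3: ['scannerTwo']}
```

Timepoint 2 is the one where the same participant was scanned on both scanners, which is the case used to estimate between-scanner error.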

yarikoptic commented 12 months ago

#54 could provide a generic way to support overall desired layout for this. The issue could be split into two:

recommend adding site entity. Can be done right now as a PR. This would allow for annotation but without changes to layout.
#54

yarikoptic commented 3 months ago

Also somewhat relates to https://bids.neuroimaging.io/bep035 where the idea is to aggregate across studies, so it adds the entity study. But the semantics are probably different enough to warrant having both study and site.

yarikoptic commented 3 months ago

other pieces of feedback:

site is IMHO better than scanner since not MR/CT/...-specific
site is better than center since "site" is more generic than "center".
indeed site can be incorporated either into sub or into ses depending on the use case, and its utility remains not highly demanded ATM IMHO. At large there is already reliance on some metadata field to tell apart different acquisition equipment "samples", versions of their software, etc . But they are largely spread out and not formalized. I see formalization of site as "data acquisition site" as a combination which would give us sites.tsv and sites.json to aggregate/summarize (see #65) metadata specific to the data acquisition sites.
there was a promise stated in BIDS 1.0 about "multi-site" support since "the initial markdown" https://github.com/bids-standard/bids-specification/commit/a364add49e106388338a4dd980aad6f9a4e45eea#diff-82b82851a7dbfdcaec1452e9b42e7d12522fe6caec85998bc85e2c7cf3b341afR2477 -- so we better "deliver"
IMHO there is no point to recommend addition of _site- entity outside of the BIDS 2.0 since there is no "the level" site should be used - could be over subjects (site-/sub-) or under subjects (sub-/site- in particular e.g. for "traveling human phantom" etc), so at large depends on having
- #54
edit: I would consider site entity as OPTIONAL and likely after subject entity in the ordering (we can have multiple sessions within site but not multiple sites within session for the same subject. There could be multiple subjects within a session across sites though in some "hyperscanning" etc studies). But as "OPTIONAL" it would likely (we are yet to formalize) require "manual" specification of the placement within #54 solution.

bids-standard / bids-2-devel

Multi-site/center studies #11
