Open jdkent opened 3 months ago
Thinking about adding an atlas_description.json to give general information about the atlas, then using atlas-
<dataset>/derivatives/<pipeline>/
atlas-<label>_desc-<label>_[dseg|probseg|mask].tsv # These will generally be shared across templates/spaces
atlas-<label>_desc-<label>_[dseg|probseg|mask].json # {spatialReference: "orig_space_uri"}
atlas-<label>_space-<spacelabel>_res-<label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
atlas-<label>_space-<spacelabel>_desc-<label>_[dseg|probseg|mask].json # {spatialReference: "spacelabel"}
atlas-<label>_space-<space2label>_res-<label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
atlas-<label>_space-<space2label>_desc-<label>_[dseg|probseg|mask].json # {spatialReference: "space2label"}
atlas-<label>_description.json
<dataset>/derivatives/
atlas-<label>_description.json
atlas-<label>_desc-<label>_[dseg|probseg|mask].tsv # These will generally be shared across subjects
atlas-<label>_space-<space1label>_desc-<label>_[dseg|probseg|mask].json # {"spatialReference": "space1label"}
atlas-<label>_space-<space2label>_desc-<label>_[dseg|probseg|mask].json # {"spatialReference": "space2label"}
sub-01/
func/
sub-01_atlas-<label>_space-<space1label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|tsv][.gz]
sub-01_atlas-<label>_space-<space2label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|tsv][.gz]
N/A
The current JSON file mostly carries information that assumes the atlas is expressed in a single space.
This should be split into two files. One should be an atlas-<label>_description.json, which has immutable information about how the atlas was derived.
The other should be the atlas-<label>_dseg.json, which has SpatialReference and Resolution defined. These parameters describe the projection of an atlas to a specific space, which could be any space at any resolution. The original atlas will be distributed in a specific space at a particular resolution (or a set of spaces/resolutions), but an atlas used in a particular derivative dataset could be in several spaces/resolutions that are not in the originally distributed atlas.
atlas-<label>_description.json
Key name | Description |
---|---|
Name | REQUIRED. Name of the atlas. |
Description | RECOMMENDED. Longform description of the atlas. |
Dimensions | RECOMMENDED. Dimensions of the atlas, MUST be 3 (for deterministic atlases) or 4 (for probabilistic atlases). |
4thDimension | OPTIONAL. RECOMMENDED if probabilistic atlas. Should indicate what the 4th dimension entails/refers to, e.g. “Indices”. |
CoordinateReportStrategy | OPTIONAL. MUST BE ONE OF: “peak”, “center_of_mass”, “other”. Indicate the method of coordinate reporting in statistically significant clusters. Could be the “peak” statistical coordinate in the cluster or the “center_of_mass” of the cluster. RECOMMENDED if x, y, z values are set in the .tsv file. |
Authors | RECOMMENDED. List of the authors involved in the creation of the atlas. |
Curators | RECOMMENDED. List of curators who helped make the atlas accessible in a database or dataset. |
Funding | RECOMMENDED. The funding source(s) involved in the atlas creation. |
License | RECOMMENDED. The license agreement for using the atlas. |
ReferencesAndLinks | RECOMMENDED. A list of relevant references and links pertaining to the atlas. |
Species | RECOMMENDED. The species the atlas was derived from. For example, could be Homo sapiens, Macaque, Rat, Mouse, etc. |
DerivedFrom | RECOMMENDED. Indicate what data modality the atlas was derived from, e.g. "cytoarchitecture", "resting-state", "task". |
LevelType | RECOMMENDED. Indicate what analysis level the atlas was derived from, e.g. "group", "individual". |
atlas-<label>_dseg.json
Key name | Description |
---|---|
SpatialReference | RECOMMENDED. Point to an existing atlas in a template space (url or relative file path where this file is located). |
Resolution | RECOMMENDED. Resolution atlas is provided in. |
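To make the two tables a bit more concrete, here is a minimal Python sketch of what the two files could contain and how the stated constraints might be checked. The key names come from the tables above; the example values, the URL, and the function name `check_atlas_description` are made up for illustration and are not part of any validator.

```python
import json

# Requirement levels mirror the draft tables above (only Name is REQUIRED).
DESCRIPTION_REQUIRED = {"Name"}
COORDINATE_STRATEGIES = {"peak", "center_of_mass", "other"}


def check_atlas_description(meta):
    """Return a list of problems found in a draft atlas description dict."""
    problems = []
    for key in sorted(DESCRIPTION_REQUIRED - set(meta)):
        problems.append(f"missing REQUIRED key: {key}")
    if "Dimensions" in meta and meta["Dimensions"] not in (3, 4):
        problems.append("Dimensions must be 3 or 4")
    strategy = meta.get("CoordinateReportStrategy")
    if strategy is not None and strategy not in COORDINATE_STRATEGIES:
        problems.append(f"unknown CoordinateReportStrategy: {strategy}")
    return problems


# Hypothetical content for atlas-<label>_description.json.
description = {
    "Name": "Example atlas",
    "Dimensions": 3,
    "CoordinateReportStrategy": "center_of_mass",
    "Species": "Homo sapiens",
    "LevelType": "group",
}

# Hypothetical content for atlas-<label>_dseg.json.
dseg_sidecar = {
    "SpatialReference": "https://example.org/template.nii.gz",
    "Resolution": "1mm isotropic",
}

print(check_atlas_description(description))
print(json.dumps(dseg_sidecar, indent=2))
```

The split keeps the immutable facts about the atlas in one dict and the space-specific facts in the other, which is the point of having two files.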
Okay, so if the general use cases can be broken down into 3 buckets:
Let's test out the 3 use cases
<dataset>/derivatives/<pipeline>/
atlas-<label>_description.json
atlas-<label>_desc-<label>_[dseg|probseg|mask].tsv # These will generally be shared across templates/spaces
atlas-<label>_desc-<label>_[dseg|probseg|mask].json # {spatialReference: "orig_space_uri"}
atlas-<label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
The atlas is only projected onto one template space
<dataset>/derivatives/<pipeline>/
atlas-<label>_description.json
atlas-<label>_desc-<label>_[dseg|probseg|mask].tsv # These will generally be shared across templates/spaces
atlas-<label>_space-<space1label>_desc-<label>_[dseg|probseg|mask].json # {spatialReference: "space1label_uri"}
atlas-<label>_space-<space1label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
atlas-<label>_space-<space2label>_desc-<label>_[dseg|probseg|mask].json # {spatialReference: "space2label_uri"}
atlas-<label>_space-<space2label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
You could have one of the two omit the space label, but that wouldn't be recommended.
<dataset>/derivatives/<pipeline>/
atlas-<label>_description.json
atlas-<label>_desc-<label>_[dseg|probseg|mask].tsv # These will generally be shared across templates/spaces
atlas-<label>_desc-<label>_[dseg|probseg|mask].json # {spatialReference: "space1label_uri"}
atlas-<label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
atlas-<label>_space-<space2label>_desc-<label>_[dseg|probseg|mask].json # {spatialReference: "space2label_uri"}
atlas-<label>_space-<space2label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
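One way to see why omitting the space label is less convenient: a tool that splits the filename into entities gets no space from the name and has to open the JSON to find the spatialReference. The following is a rough sketch only; `parse_entities` and the example filenames are hypothetical, and real BIDS tooling handles many more cases.

```python
def parse_entities(filename):
    """Split a BIDS-style filename into (entities, suffix).

    Assumes key-value pairs joined by underscores, followed by a suffix,
    followed by the extension(s).
    """
    stem = filename.split(".", 1)[0]          # drop .nii.gz, .json, ...
    *pairs, suffix = stem.split("_")          # last chunk is the suffix
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix


# Hypothetical filenames following the patterns listed above.
with_space = "atlas-example_space-MNI152NLin6Asym_desc-demo_dseg.nii.gz"
without_space = "atlas-example_desc-demo_dseg.nii.gz"

print(parse_entities(with_space))
print(parse_entities(without_space))  # no "space" key: must consult the JSON
```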
<dataset>/derivatives/
atlas-<label>_description.json
atlas-<label>_desc-<label>_[dseg|probseg|mask].tsv # These will generally be shared across subjects
atlas-<label>_desc-<label>_[dseg|probseg|mask].json
atlas-<label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
sub-01/
func/
sub-01_atlas-<label>_[dseg|probseg|mask].json # {"spatialReference": "orig_uri", "resolution": "3mm iso"}
sub-01_atlas-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|tsv][.gz]
The JSON file at the subject level overrides the top-level JSON file, so the correct spatialReference and resolution are found. The top-level JSON file here is primarily useful for describing the original atlas. That information is then used to transform the atlas template into subject space, and the transformation is applied to the actual atlas file to create the subject-specific JSON. I can imagine use cases where the original atlas is projected onto multiple spaces and at the subject level those spaces are resampled to the subject-specific resolution. If people want to rewrite TSVs at the top level for specific resolutions, or at the subject level to identify missing ROIs, that works as well.
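The override behaviour described here can be sketched as a simple dictionary merge in which more specific sidecars win. This is only an illustration of the idea; `resolve_metadata` is a hypothetical helper, and the top-level values are assumed since the listing above does not show them.

```python
def resolve_metadata(*sidecars):
    """Merge sidecar dicts ordered from least to most specific.

    Keys in later (more specific) sidecars replace keys from earlier ones.
    """
    merged = {}
    for sidecar in sidecars:
        merged.update(sidecar)
    return merged


# Assumed top-level content (not shown in the listing above).
top_level = {"spatialReference": "orig_uri", "resolution": "1mm iso"}
# Subject-level content; resolution matches the sub-01 example above.
subject_level = {"resolution": "3mm iso"}

resolved = resolve_metadata(top_level, subject_level)
print(resolved)  # {'spatialReference': 'orig_uri', 'resolution': '3mm iso'}
```

Only the keys that differ at the subject level need to be repeated there; everything else is picked up from the top-level file.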
<dataset>/derivatives/
atlas_description.json
sub-01/
anat/
sub-01_atlas-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|tsv][.gz][.ome-tiff|.png]
sub-01_atlas-<label>_desc-<label>_[dseg|probseg|mask].tsv
sub-01_atlas-<label>_desc-<label>_[dseg|probseg|mask].json
sub-01_atlas-<label>_desc-<label>_coordsystem.json
The atlas_description.json could be added at the top level if the method of generation for each subject was the same.
This looks good to me.
Though I think we will want a cohort or group entity so we can create atlases from various cohorts, e.g. cohort-control, cohort-young, cohort-dementia, etc.
atlas-<label>_description.json is not currently a BIDS derivatives file, and is partially overlapping with dataset_description.json
seg-<label> entity for the atlas
seg-<label> with inheritance, it requires more complex changes to the validator and spec.
atlas-<label>_description.json file
Dear @jdkent,
thank you so much for all the work you have put in to consolidate both standpoints with fresh eyes! 🙏 The OpenneuroPET team, which was also one of the drivers behind the atlas BEP, highly appreciates that. We are still going through the details, but one of the major reasons we wanted an atlas outside of a derivative was that we would be able to validate a standalone atlas data set. So far it is not possible to upload data that only consists of derivatives on OpenNeuro in order to make atlases findable. Could you maybe comment on that already while we go through the details?
Hi @melanieganz ,
I've got another round of edits to make to fully encompass the feedback I got during the meeting, so please hold off trying to understand the mess :laughing:, wrt your comment:
one of the major reasons we wanted an atlas outside of a derivative was that we would be able to validate a standalone atlas data set. So far it is not possible to upload data that only consists of derivatives on OpenNeuro in order to make atlases findable.
Talking with @rwblair, it sounds like this is a relatively high-priority item to make a derivative dataset uploadable to openneuro.
Hi James,
The issue I can see is that those changes do not allow what was also intended, i.e. to share an atlas on its own (or I missed something? - all the examples are /derivatives - re Case 1)
To give everyone the context, there was a discussion about 1.5 years ago about where an 'atlas' should be shared: as something used within a dataset, or for sharing on its own. It was 'decided' that sharing at the root level was best because:
(1) an atlas is both a tool (like stuff in code) and an image (like stuff in sub-)
(2) moving to root allows sharing new atlases on their own
Back to the PR, I'd suggest allowing the folder /atlas in the root directory as initially proposed, and of course still possible in /derivatives as the output of a computation. Thx
PS: can't share on its own because if no sub- in root, then not valid
The issue I can see is that those changes do not allow what was also intended, i.e. to share an atlas on its own (or I missed something? - all the examples are /derivatives - re Case 1)
That is a good concern, definitely want to keep the ability to share an atlas on its own.
From my understanding of the derivative common principles (see storage of derived datasets, point number 2), one should be able to share a derived dataset as a standalone dataset, enabling the use case of sharing an atlas on its own.
The sub- being required in BIDS-root is a pesky nuisance. I think for atlases (and other derivatives), it motivates the ability to share without that requirement. I'm not 100% positive, but I believe that sub- is not required in derivative datasets. I think the problems with uploads to openneuro are more of a validation issue (server side) than a problem with the data itself.
Will look into that.
Very good point @jdkent, we could show both options for case 1, adding in the dataset_description.json DatasetType. What do you think?
dataset_description.json
DatasetType: 'raw'
<dataset>/derivatives/<pipeline>/atlas/
atlas-<label>_description.json
atlas-<label>_desc-<label>_[dseg|probseg|mask].tsv # constructed from sub- in the dataset - shows how it is built
atlas-<label>_desc-<label>_[dseg|probseg|mask].json # {spatialReference: "orig_space_uri"}
atlas-<label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
<dataset>/sub-001/
<dataset>/sub-002/
dataset_description.json
DatasetType: 'derivative'
<dataset>/atlas/
atlas-<label>_description.json
atlas-<label>_desc-<label>_[dseg|probseg|mask].tsv # this will generally be for sharing
atlas-<label>_desc-<label>_[dseg|probseg|mask].json
atlas-<label>_desc-<label>_[dseg|probseg|mask].[nii|dscalar.nii|dlabel.nii|label.gii|.tsv][.gz]
-- and yes you are right it has consequences on the validation - I will make an example which indeed will remind us to 'promote' atlas- at the same level as sub-, but a conditional statement is needed in the validator: if atlas- then sub- not mandatory
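As a rough sketch of that conditional (this is not the actual bids-validator logic; the function name and the directory listings are hypothetical), the check on the dataset root could look something like this:

```python
def root_has_subject_or_atlas(entries):
    """Return True if a dataset root listing passes the relaxed rule.

    Relaxed rule sketched here: at least one sub-* entry is required
    unless atlas content (an 'atlas' folder or atlas-* entries) is present.
    """
    has_sub = any(name.startswith("sub-") for name in entries)
    has_atlas = any(
        name == "atlas" or name.startswith("atlas-") for name in entries
    )
    return has_sub or has_atlas


# Hypothetical root listings for the two layouts discussed above.
raw_with_subjects = ["dataset_description.json", "sub-001", "sub-002", "derivatives"]
standalone_atlas = ["dataset_description.json", "atlas"]
empty_dataset = ["dataset_description.json"]

print(root_has_subject_or_atlas(raw_with_subjects))  # True
print(root_has_subject_or_atlas(standalone_atlas))   # True
print(root_has_subject_or_atlas(empty_dataset))      # False
```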
@melanieganz @effigies is there a BIDS-example repo fork we can push stuff for this PR?
If one does not exist, I would suggest just making a bep038 branch on the main repo.
Dear @CPernet and @jdkent,
I just made a branch of the bids-examples according to what @effigies suggested. So we can add stuff there: https://github.com/bids-standard/bids-examples/tree/bep038
Dear @jdkent, @CPernet, @effigies, @pwighton and @oesteban,
as suggested by @effigies I added a bep038 branch to the bids examples and following the example structure that Cyril suggested above I added two examples for atlas001 (raw dataset) and atlas002 (derivative dataset) that are very generic and that we can base our discussion on. I also added an atlas003 example where I tried to base it on the PET atlas that @pwighton has already shared.
Can you please look at them and let me know if that makes sense and if that covers your needs or what changes are necessary?
Also please note, when making the examples in the bids-example repo I discussed the example Cyril had made above with him and he corrected some misspelling there through me.
Hello everyone,
thank you very much for your continued discussion and work on this.
We also started BEP038 examples during the Copenhagen meeting last year here. I think most, if not all of them, are outdated and are thus for reference only.
Thanks again.
Best, Peer
Dear @jdkent ,
we really, really would like to have this move forward. But you wanted to fix the examples so they align with your ideas. Can we help with this? Can you point me to the right examples in the text and then I can make examples for the validator to check?
apologies, I am setting some time this week to finish this up.
Hi @jdkent,
thank you! So just that we agree, I can take the examples from the .md you linked to and add them to/modify the existing examples in https://github.com/bids-standard/bids-examples/tree/bep038 and then we are good to check this with @rwblair?
yes, I am in agreement with modifying those examples and then checking in.
@melanieganz I am working on uploading PS13 to templateflow, showing how that tool would encode it. I hit an issue with these examples atlas003/atlas/petsurfer/atlas-ps13_hemi-{R,L}_space-fsaverage_stat-mean_desc-nopvc_mimap.nii.gz
NIfTI files are borderline "valid" to allow hemi- (just because we haven't explicitly forbidden it, but it feels like we should). These should be GIFTIs as they encode surface information.
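For note keeping, a small sketch of how such files could be flagged. It assumes that a hemi- entity on a plain .nii/.nii.gz file is suspicious while CIFTI-2 (.dscalar.nii, .dlabel.nii) and GIFTI files are fine; that rule is my reading of the comment above, not something the spec states, and the second filename is made up.

```python
CIFTI_EXTENSIONS = (".dscalar.nii", ".dlabel.nii")


def hemi_on_volumetric_nifti(filename):
    """Return True when a hemi- entity is used on a plain NIfTI file."""
    has_hemi = any(part.startswith("hemi-") for part in filename.split("_"))
    is_nifti = filename.endswith((".nii", ".nii.gz"))
    is_cifti = filename.endswith(CIFTI_EXTENSIONS)
    return has_hemi and is_nifti and not is_cifti


# First name is one expansion of the hemi-{R,L} pattern quoted above;
# the second is a hypothetical GIFTI counterpart.
nifti_name = "atlas-ps13_hemi-L_space-fsaverage_stat-mean_desc-nopvc_mimap.nii.gz"
gifti_name = "atlas-ps13_hemi-L_space-fsaverage_desc-nopvc_dseg.label.gii"

print(hemi_on_volumetric_nifti(nifti_name))  # True
print(hemi_on_volumetric_nifti(gifti_name))  # False
```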
Other than that, I cannot provide more useful feedback because I don't think the proposal is in good shape. I've been preparing a PR for the last three months and hope to finish it today -- apologies for the delays.
random thoughts for notekeeping
looking at Oscar's suggestions in this comment for changes in the file structure: https://github.com/bids-standard/bids-specification/issues/1281#issuecomment-1921937965
the file structure changes look like an improvement to me (nothing appears to drastically change)
CASE 1:
original
Oscar's proposal
What I like about Oscar's structure is that: 1) the atlas data is in one place (as opposed to being in both the raw and derivative dataset). Are there cases where a raw dataset does not make sense without an accompanying atlas? 2) there is increased alignment with templateflow (which I think would dovetail nicely with a template BEP) 3) the important bits of the atlas are maintained at the top level so I'm not lost looking for atlas information. I would only add that for specific templates, one could argue that some ROIs/indices/parcels could be resampled out of existence, and perhaps the tsv would want to reflect the reality of the situation.
For findability/searchability there is debate as to whether an atlas or a template should be a first class citizen. Looking at the templateflow website, it is not clear what atlases are available. Adding the tsv/json at the top level will make that much more clear, (and ideally, each atlas could be expressed in each template space, user beware!).
BIDS maintainer proposal (making an atlas as its own derivative dataset expressed in one or more spaces)
This is a concept mapping from tpl being a reference and space being an application of the reference, so the tpl directory becomes an atlas. The order of entities is a little weird, since the folder starts with space, but then atlas becomes a first class citizen again. This use case is for packaging an atlas by itself, the primary use-cases deriving this
BIDS maintainer proposal (making an atlas as its own derivative in one space)
where the spatial reference is defined in the .json file and the space directory is redundant. OR do the same thing as the previous example and include the space-<spacelabel> directory
Case 2
original
Oscar's proposal
I also prefer Oscar's suggestion here because: 1) He's not storing the atlas in the raw dataset (same as case 1) 2) Inheritance looks clean
There is the magic of how the atlases got into a particular space, but the spatialReference defined in the top-level .json file should identify the original template used by the atlas. For practicality of sharing, I would probably also include the actual atlas image at the top level of derivatives so that people can more easily apply the necessary transformations.
James' proposal
BIDS maintainer proposal (space
If someone has multiple atlases in a dataset, it may get crowded in the top level with the number of files, so placing the files in subdirectories helps with the aesthetics.
Case 3:
Original
Oscar's proposal:
No change
This one also looks fine to me!
Overall I find Oscar's suggestions clean, and I'm fine with treating an atlas as a modular derivative (that does not need to reside in a BIDS-raw dataset).
BIDS Maintainer's proposal
Also no changes for this use case.
Case 4 (just referencing an external atlas)