AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/

Reproducing published injection site set #14

Closed maedoc closed 8 years ago

maedoc commented 8 years ago

Using only the AllenSDK, we are trying to reproduce the set of injection/target sites used in Oh et al 2014 (and Calabrese et al 2015). It is stated that the selection is made such that site sizes roughly match infection size, but it appears that the only way to reproduce this selection in the ontology is to analyze all the images ourselves.

Have we missed something? Are the nodes in the ontology annotated as to whether they are in the 295 regions used in the Oh et al paper? If not, could this be added to the SDK?

FrancescaMelozzi commented 8 years ago

We noticed that the Mouse Connectivity notebook at http://alleninstitute.github.io/AllenSDK/_static/examples/nb/mouse_connectivity.html#Mouse-Connectivity says: "On the connectivity atlas web site, you'll see that we show most of our data at a fairly coarse structure level. We did this by creating a "structure set" of ~300 structures. If you want to filter your ontology structures down to that list, you can do this:"

    from allensdk.api.queries.ontologies_api import OntologiesApi
    summary_structures = OntologiesApi().get_structures(structure_set_names='Mouse Connectivity - Summary')
    summary_structure_ids = [ s['id'] for s in summary_structures ]
    ontology[summary_structure_ids]

However, this list contains the ids of only 293 structures, not the 295 structures reported in Oh et al., 2014.

Moreover, if we select only the experiments

- that use non-transgenic C57BL/6J mice (as in Oh et al., 2014), and
- whose injection site id belongs to the list of 293 structures in the ontology,

we obtain only 265 injection sites: 28 structures belong to the selected ontology but are not injection sites in the experiment list. These 28 areas are:

- ID 1027: Posterior auditory area
- ID 312782574: Laterointermediate area
- ID 312782628: Postrhinal area
- ID 312782546: Anterior area
- ID 417: Rostrolateral visual area
- ID 589: Taenia tecta
- ID 647: Cortical amygdalar area, posterior part
- ID 982: Fasciola cinerea
- ID 934: Entorhinal area, medial part, ventral zone
- ID 333: Septohippocampal nucleus
- ID 321: Subgeniculate nucleus
- ID 286: Suprachiasmatic nucleus
- ID 338: Subfornical organ
- ID 398: Superior olivary complex
- ID 880: Dorsal tegmental nucleus
- ID 318: Supragenual nucleus
- ID 207: Area postrema
- ID 711: Cuneate nucleus
- ID 642: Nucleus of the trapezoid body
- ID 372: Infracerebellar nucleus
- ID 235: Lateral reticular nucleus
- ID 859: Parasolitary nucleus
- ID 781: Nucleus y
- ID 920: Central lobule
- ID 928: Culmen
- ID 936: Declive (VI)
- ID 1017: Ansiform lobule
- ID 1009: fiber tracts

The code that we use is:

    # Load all non-Cre experiments as a DataFrame.
    from allensdk.core.mouse_connectivity_cache import MouseConnectivityCache
    mcc = MouseConnectivityCache(manifest_file='manifest.json')
    all_experiments = mcc.get_experiments(dataframe=True, cre=False)

    # Collect the ids of the structures in the summary structure set.
    from allensdk.api.queries.ontologies_api import OntologiesApi
    summary_structures = OntologiesApi().get_structures(structure_set_names='Mouse Connectivity - Summary')
    summary_structure_ids = [ s['id'] for s in summary_structures ]

    # Map each injection structure id to the experiments that injected into it.
    ist2e = {}
    for eid in all_experiments.index:
        if all_experiments.loc[eid, 'strain'] == 'C57BL/6J':
            for ist in all_experiments.loc[eid, 'injection-structures']:
                isti = ist['id']
                if isti in summary_structure_ids:
                    if isti not in ist2e:
                        ist2e[isti] = []
                    ist2e[isti].append(eid)

    print('number injection sites in the dictionary: %d' % len(ist2e))
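The grouping logic can also be checked in isolation, without downloading anything. The following is only a minimal sketch on made-up records: the helper name `group_by_injection_structure`, the experiment ids, and the structure ids 1 to 3 are hypothetical, and only the `strain` and `injection-structures` field names are taken from the code above. It also lists the allowed structures that never appear as an injection site, which is how a list like the 28 areas above can be derived.

```python
from collections import defaultdict


def group_by_injection_structure(experiments, allowed_ids, strain="C57BL/6J"):
    """Map each allowed injection structure id to the experiment ids that hit it."""
    allowed = set(allowed_ids)
    grouped = defaultdict(list)
    for exp in experiments:
        # Skip experiments from other strains.
        if exp["strain"] != strain:
            continue
        for structure in exp["injection-structures"]:
            if structure["id"] in allowed:
                grouped[structure["id"]].append(exp["id"])
    return dict(grouped)


# Made-up records for illustration only; none of these ids are real.
experiments = [
    {"id": 100, "strain": "C57BL/6J", "injection-structures": [{"id": 1}, {"id": 2}]},
    {"id": 101, "strain": "C57BL/6J", "injection-structures": [{"id": 1}]},
    {"id": 102, "strain": None, "injection-structures": [{"id": 3}]},
]
summary_structure_ids = [1, 2, 3]

ist2e = group_by_injection_structure(experiments, summary_structure_ids)

# Allowed structures that no selected experiment injected into.
missing = sorted(set(summary_structure_ids) - set(ist2e))

print("injection sites found:", len(ist2e))
print("summary structures without an injection:", missing)
```

Using a set for the allowed ids keeps the membership test cheap, and the set difference gives the structures without any injection directly.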

Is the list of target sites used in Oh et al., 2014 the one that we downloaded with OntologiesApi().get_structures(structure_set_names='Mouse Connectivity - Summary')? If so, why does the experiment list not contain experiments for all of the selected injection sites?

Thank you.

dyf commented 8 years ago

Sorry I didn't respond to this issue sooner.

Briefly, the "Mouse Connectivity - Summary" structure set is what we use for visualization in the connectivity atlas web page. We add/remove structures to/from this set over time based on internal and external feedback. Likewise, we regularly update our structure ontology (usually for refinements).

We have also updated the primary injection structure labels of all of our experiments to correspond to automated quantification (which structure has the most projecting pixels inside of the annotated injection site). Previously this was a manual call made by institute anatomists that occasionally conflicted with the automated calls.
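To make the rule concrete, here is a rough sketch of "the structure with the most projecting pixels inside the annotated injection site". It is not the institute's pipeline, which is not described in this thread: the function name `primary_injection_structure`, the flat per-voxel lists, and the structure ids 10, 20, and 30 are all assumptions made for illustration.

```python
from collections import Counter


def primary_injection_structure(voxel_structure_ids, in_injection_site, is_projecting):
    """Return the structure id with the most projecting voxels inside the site.

    All three arguments are parallel sequences with one entry per voxel.
    Returns None when no voxel is both inside the site and projecting.
    """
    counts = Counter(
        sid
        for sid, inside, projecting in zip(
            voxel_structure_ids, in_injection_site, is_projecting
        )
        if inside and projecting
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical voxels: structure ids 10, 20 and 30 are placeholders.
structure_ids = [10, 10, 20, 20, 20, 30]
inside_site = [True, True, True, True, False, True]
projecting = [True, True, True, False, True, False]

primary = primary_injection_structure(structure_ids, inside_site, projecting)
print("primary injection structure:", primary)
```

Structure 20 has more voxels overall, but only one of them is both inside the injection site and projecting, so structure 10 is selected. A manual call based on where the injection looked centered could plausibly pick a different structure in a case like this.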

Combined, I believe all of these explain the issues you see. Please let me know if you have any other questions.

dyf commented 8 years ago

One quick comment: the 0.11.0 release has slightly changed the syntax of the structure set query you were using. This:

    OntologiesApi().get_structures(structure_set_names='Mouse Connectivity - Summary')

now requires explicit quotes:

    OntologiesApi().get_structures(structure_set_names="'Mouse Connectivity - Summary'")

Sorry for the hassle. This query in particular was inconsistent with all of the other queries in the SDK.
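For scripts that pass structure set names around as plain strings, a small wrapper can add the quotes in one place. This is a sketch only: `quote_structure_set_name` is a hypothetical helper, not part of the AllenSDK, and it assumes the name itself contains no single quotes.

```python
def quote_structure_set_name(name):
    """Wrap a structure set name in single quotes unless it is already quoted."""
    if len(name) >= 2 and name[0] == "'" and name[-1] == "'":
        return name
    return "'" + name + "'"


quoted = quote_structure_set_name("Mouse Connectivity - Summary")
print(quoted)

# Intended use with the 0.11.0 syntax described above (not executed here):
# OntologiesApi().get_structures(structure_set_names=quoted)
```

The helper leaves an already quoted name unchanged, so applying it twice does not double the quotes.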

dyf commented 8 years ago

Closing this -- I think the question is answered.