AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/

missing region IDS #945

Open lmarcelac opened 5 years ago

lmarcelac commented 5 years ago

I'm trying to map cell counts to the new CCF. I've downloaded the reference-space as well as the structure tree ontology with the IDs and region name, but when I run my mapping, I see region IDs that are not present in the ontology (182305696, 182305712, 312782560, plus others). Is there a way to download all present IDs in the reference space? Is it safe to assume that these are unlabeled regions or is it possible to extract the parent region for these?

NileGraddis commented 5 years ago

Hi @lmarcelac

Could you post the exact script that you ran? I did a simple check:

>>> from allensdk.core.mouse_connectivity_cache import MouseConnectivityCache
>>> mcc = MouseConnectivityCache(resolution=10)
>>> annot, header = mcc.get_annotation_volume()
>>> 182305696 in annot
False

which seems to suggest that 182305696 is not in the latest CCF annotation volume. It is hard to tell where those values came from without knowing exactly what code you are running.
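A membership test like the one above checks a single ID. To list every ID that appears in a volume but not in the ontology, a set difference is enough. The following is only a minimal sketch: the helper name `ids_missing_from_ontology` and the toy inputs are made up for illustration and are not part of the AllenSDK.

```python
def ids_missing_from_ontology(volume_ids, ontology_ids):
    """Return the sorted IDs found in the volume but not in the ontology."""
    known = {int(v) for v in ontology_ids}
    known.add(0)  # 0 labels background voxels and is not a structure
    present = {int(v) for v in volume_ids}
    return sorted(present - known)


# Toy data standing in for the real inputs (values are illustrative only).
volume_ids = [0, 997, 8, 8, 182305696]
ontology_ids = [997, 8, 567]

missing = ids_missing_from_ontology(volume_ids, ontology_ids)
print(missing)
```

With real data, `volume_ids` could be the result of `numpy.unique(annot)` and `ontology_ids` the IDs of the structures in the structure tree.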

lmarcelac commented 5 years ago

Thanks for the quick reply @NileGraddis

Unfortunately, I'm not yet very comfortable with Python, so I'm sorry for this roundabout way of doing this. Here's what I'm trying:

I downloaded the annotation as outlined in the example. I need 25 µm spacing, so I didn't change anything from the example here: https://allensdk.readthedocs.io/en/latest/_static/examples/nb/reference_space.html

I generated a csv with a list of regions and IDs with the following:

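import csv

from allensdk.api.queries.ontologies_api import OntologiesApi
from allensdk.core.structure_tree import StructureTree

oapi = OntologiesApi()
# 1 is the ID of the adult mouse structure graph
structure_graph = oapi.get_structures_with_sets([1])

# convert the raw API records into the format StructureTree expects
structure_graph = StructureTree.clean_structures(structure_graph)

tree = StructureTree(structure_graph)

# map each structure ID to its full name
name_map = tree.value_map(lambda x: x['id'], lambda y: y['name'])

# newline="" stops the csv module from writing blank rows on Windows
with open("output.csv", "w", newline="") as f:
    w = csv.writer(f)
    for key, val in name_map.items():
        w.writerow([key, val])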

I then converted the nrrd to a stack of TIFFs and imported it, along with the CSV list, into MATLAB. I wanted to map nonzero pixels in a stack of images to the corresponding stack in the annotation and extract the region IDs. While doing this, I kept getting concatenation errors, which I realized were caused by missing IDs. I ran the following code to check whether there are IDs in the annotation images that are not in my CSV.

IDs = [];

% imported table of IDs (renamed so it does not shadow the built-in table)
id_table = IDlist;

%% load images
for n = 100:529
    filename_channel_1 = sprintf('annotation_00%d.tif', n);

    I1 = imread(filename_channel_1);

    % turn reference image into a column vector
    I1array = I1(:);

    % append IDs of current image to array of all IDs
    IDs = [IDs; I1array];
end

%% unique entries only (unique also sorts)

single_IDs = unique(IDs);

ID_list = table2array(id_table(:,1));

% find IDs not present in list
missing = [];
for m = 1:length(single_IDs)

    if ~ismember(single_IDs(m), ID_list)
        missing = [missing; single_IDs(m)];
    end
end

This is where I get a list of IDs that don't seem to be present in the list I downloaded. Have I missed any IDs in the generation of my csv file?

Thank you!

scott-trinkle commented 4 years ago

I have a similar problem. I am trying to generate masks of subfields within the mouse hippocampus, and certain structures seem to be missing in the annotation. Example below:

from allensdk.core.mouse_connectivity_cache import MouseConnectivityCache
mcc = MouseConnectivityCache(resolution=50)
tree = mcc.get_structure_tree()
ID = tree.get_structures_by_acronym(['CA1slm'])[0]['id']
mask, info = mcc.get_structure_mask(ID)

That returns the following error:

HTTPError: 404 Client Error: Not Found for url: http://download.alleninstitute.org/informatics-archive/current-release/mouse_ccf/annotation/ccf_2017/structure_masks/structure_masks_50/structure_391.nrrd

Seems like all of the masks are missing for the substructures of CA1, CA2, CA3, DG and SUB.

If I manually download an older annotation (http://download.alleninstitute.org/informatics-archive/current-release/mouse_ccf/annotation/mouse_2011/annotation_50.nrrd), I can find some of the structures there.

Any idea what's going on / why some structures were removed from the annotation?
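One way to check whether a structure is labelled at all in a given annotation volume, independently of the pre-computed mask files, is to build the mask directly from the volume. This is only a sketch on a toy array: `mask_from_annotation` is a hypothetical helper, the label values are placeholders, and the thread does not confirm why the mask files are absent.

```python
import numpy as np


def mask_from_annotation(annotation, structure_ids):
    """Boolean mask of voxels labelled with any of the given IDs."""
    return np.isin(annotation, list(structure_ids))


# Toy 2x3 "annotation volume"; the label values are placeholders.
annotation = np.array([[0, 382, 382],
                       [391, 0, 382]], dtype=np.uint32)

mask = mask_from_annotation(annotation, [391])
empty = mask_from_annotation(annotation, [399])

print(int(mask.sum()))    # number of voxels carrying label 391
print(bool(empty.any()))  # whether label 399 occurs anywhere
```

An all-False mask means that no voxel in that volume carries the requested label. For a real volume, `annotation` would be the array returned by `mcc.get_annotation_volume()`.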

lmarcelac commented 4 years ago

What I eventually realized was that my problem was in the way ImageJ was reading in the IDs from the new annotation (regions that have new IDs with really large numbers have issues being read in from the 32-bit file). It sounds like you're having a different problem, but I struggled with this for an unreasonably long time, so I thought it worth updating.
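For anyone hitting the same thing, here is a small sketch of one way large integer IDs can get corrupted. It assumes the values pass through a 32-bit float at some point; the thread does not confirm that this is exactly what ImageJ does. A 32-bit float has a 24-bit significand, so above 2^24 not every integer is representable and neighbouring integers collapse onto the same value. The helper name `through_float32` is made up for illustration.

```python
import struct


def through_float32(value):
    """Round-trip an integer through a 32-bit float, as a lossy reader might."""
    packed = struct.pack("<f", float(value))
    return int(struct.unpack("<f", packed)[0])


# The IDs reported at the top of this thread survive the round trip unchanged,
# which is what you would expect if they are already float32-rounded values.
reported = [182305696, 182305712, 312782560]
unchanged = [through_float32(rid) == rid for rid in reported]
print(unchanged)

# Fifteen consecutive integers around the first reported ID all collapse onto it.
collapsed = {through_float32(v) for v in range(182305689, 182305704)}
print(collapsed)
```

If this is the cause, reading the nrrd with a tool that preserves the unsigned 32-bit integer type should keep the original IDs intact.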