geneontology / helpdesk

The Gene Ontology Helpdesk
http://help.geneontology.org

Using the GO/GO-CAM API fetch all genes that occur in each cell type in C. elegans adults #279

Closed Munfred closed 3 years ago

Munfred commented 3 years ago

I understand that GO-CAM goes beyond GO in that it explicitly ties together relationships between genes and other entities, such as developmental stage (which I think would be a subclass of the biological phase entity) and the cell types where a gene product acts (which I think would be a subclass of the location entity).

I would like to know two things:

1) Is it possible to query the GO-CAM API for C. elegans to fetch all genes that occur in adults and have a specific cell type as their location?

2) How can I use the API to fetch the taxonomy of cell types for adult C. elegans?

I want to do this because I’m helping a group analyze some nuclei data from adult C. elegans. It is challenging because there is no other directly comparable dataset, so the cell type annotations have to be done from scratch, based only on cell type marker genes. Additionally, certain cell types might not be represented, so it is best to have a multi-level taxonomy instead of a flat one, because the coarser-level labels can be assigned with more certainty than the more granular ones.

I was thinking that it would be fantastic to use GO-CAM to build such a taxonomy and its markers programmatically, instead of manually defining the markers for each cell type. In addition to being reproducible, such a taxonomy would also be self-updating as GO-CAM annotations improve!

Thank you

kltm commented 3 years ago

Currently, I believe the bulk of the data returned from the API is coming (indirectly) from our ontology and GAFs--the GAFs contain GO-CAM model information in a roundabout way. I believe the most complete way of getting at this information as we currently offer it might be to use the data files directly and examine the "extensions" field (http://geneontology.org/docs/go-annotation-file-gaf-format-2.1/#annotation-extension-column-16). That said, it may be a lack of imagination or knowledge on my part, so let me tag some others onto this thread.
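
As a rough illustration of examining the "extensions" field, here is a minimal Python sketch. It assumes a GAF 2.1 file: tab-separated, comment lines starting with `!`, the gene symbol in column 3, and the annotation extension in column 16, where `|` separates alternative groups and `,` separates `relation(ID)` expressions within a group. The function names `parse_extensions` and `genes_by_extension_target` are hypothetical helpers written for this example, not part of any GO library, and the choice of `occurs_in` as the relation of interest is also an assumption.

```python
import re
from collections import defaultdict

# One extension expression looks like relation(ID), e.g. occurs_in(CL:0000540).
EXT_RE = re.compile(r"^\s*(\w+)\(([^()]+)\)\s*$")


def parse_extensions(field):
    """Split a column-16 value into groups of (relation, target_id) tuples.

    '|' separates alternative groups; ',' separates expressions that
    apply together within one group.
    """
    groups = []
    if not field:
        return groups
    for alternative in field.split("|"):
        group = []
        for expression in alternative.split(","):
            match = EXT_RE.match(expression)
            if match:
                group.append((match.group(1), match.group(2)))
        if group:
            groups.append(group)
    return groups


def genes_by_extension_target(gaf_lines, relation="occurs_in"):
    """Map each extension target ID to the set of gene symbols annotated with it.

    gaf_lines is any iterable of GAF text lines (for example an open file).
    Only extensions using the given relation are collected.
    """
    result = defaultdict(set)
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 16:
            continue  # row has no annotation extension column
        symbol = cols[2]  # column 3: DB Object Symbol
        for group in parse_extensions(cols[15]):  # column 16
            for rel, target in group:
                if rel == relation:
                    result[target].add(symbol)
    return dict(result)
```

Passing an open file handle for a downloaded C. elegans GAF to `genes_by_extension_target` would give a dictionary keyed by extension target (cell type or anatomy IDs, where present). Rows without extensions are simply skipped, so coverage depends entirely on what curators have recorded.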

Tagging @vanaukenk @lpalbou @pgaudet

vanaukenk commented 3 years ago

Hi @Munfred

At the moment, the C. elegans GO-CAMs are not the best source for the kind of expression information you're looking for. We are still in the process of converting our 'conventional' C. elegans GO annotations into GO-CAMs, and the converted models probably won't be in production for a few months yet. Also, we have not yet systematically captured cell type and stage information in GO CC annotations.

At the moment, the WB expression pattern dataset would be the better source. You would probably need to cross-reference the anatomy and development association files on specific Expr objects. The files are available on the FTP site: ftp://ftp.wormbase.org/pub/wormbase/releases/current-production-release/ONTOLOGY/
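
The cross-referencing step could be sketched roughly as follows. This is an illustration under assumptions that should be checked against the actual files: it assumes both association files are tab-separated with a GAF-like layout, with the gene ID in column 2, the ontology term ID (anatomy or life stage) in column 5, and the Expr object ID (something matching `Expr` followed by digits) somewhere in column 6. The column positions are parameters so they can be adjusted, and `index_by_expr` and `genes_by_anatomy_at_stage` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Assumed shape of an expression pattern object ID inside the reference column.
EXPR_RE = re.compile(r"Expr\d+")


def index_by_expr(lines, gene_col=1, term_col=4, ref_col=5):
    """Index association rows by Expr ID as sets of (gene_id, term_id) pairs.

    Column positions are zero-based and are assumptions about the file layout.
    """
    index = defaultdict(set)
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= max(gene_col, term_col, ref_col):
            continue  # row too short to hold the columns we need
        for expr_id in EXPR_RE.findall(cols[ref_col]):
            index[expr_id].add((cols[gene_col], cols[term_col]))
    return index


def genes_by_anatomy_at_stage(anatomy_lines, development_lines, stage_ids):
    """Map anatomy term ID -> gene IDs, keeping only Expr objects that are
    also associated with one of the given life stage IDs."""
    anatomy = index_by_expr(anatomy_lines)
    development = index_by_expr(development_lines)
    result = defaultdict(set)
    for expr_id, stage_pairs in development.items():
        if not any(term in stage_ids for _, term in stage_pairs):
            continue  # this Expr object is not annotated to a wanted stage
        for gene_id, anatomy_term in anatomy.get(expr_id, ()):
            result[anatomy_term].add(gene_id)
    return dict(result)
```

Here `stage_ids` would be the set of life stage term IDs considered "adult", which has to be chosen by hand or derived from the life stage ontology. Joining on the Expr object rather than on the gene alone matters because a gene can be expressed in one tissue in larvae and another in adults.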

Munfred commented 3 years ago

I see. Thanks!