geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

Prioritization of Steps in Converting Molecular Events #296

Open ukemi opened 1 year ago

dustine32 commented 1 year ago

Here is the report on the number of molecular events in all Reactome pathway GO-CAMs. It was run against the set of models currently loaded into noctua-dev (commit d1e90f6). Result: 8171 molecular events spread across 1263 pathway models. Report: reactome_all_mol_event_pathways.txt

I also added some code to try to capture any "individual" without any class (i.e., no rdf:type other than owl:NamedIndividual), but I'm not sure it works, as I couldn't find any GO-CAM node like this to test against. If anyone spots one of these, could you please report the pathway and reaction (or otherwise its individual ID)?
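For illustration only, the "no class" check described above can be sketched in plain Python over a list of (subject, predicate, object) triples. This is not the code used for the report: the function name `untyped_individuals` and the `ex:` individual IDs are hypothetical, and GO_0003674 (molecular_function) is used only as a stand-in class.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_NAMED_INDIVIDUAL = "http://www.w3.org/2002/07/owl#NamedIndividual"


def untyped_individuals(triples):
    """Return subjects whose only rdf:type is owl:NamedIndividual."""
    types_by_subject = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            types_by_subject.setdefault(subject, set()).add(obj)
    # An individual is "untyped" when its type set holds nothing but
    # owl:NamedIndividual.
    return sorted(
        subject
        for subject, types in types_by_subject.items()
        if types == {OWL_NAMED_INDIVIDUAL}
    )


# Hypothetical example data (IDs are made up for this sketch).
example_triples = [
    ("ex:ind1", RDF_TYPE, OWL_NAMED_INDIVIDUAL),
    ("ex:ind1", RDF_TYPE, "http://purl.obolibrary.org/obo/GO_0003674"),
    ("ex:ind2", RDF_TYPE, OWL_NAMED_INDIVIDUAL),
]

print(untyped_individuals(example_triples))  # ['ex:ind2']
```

A model with no such node returns an empty list, which would be consistent with not having found one to test against.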

ukemi commented 1 year ago

Thanks @dustine32. That's a long tail of pathways with only a few. Some of them will be covered as we continue the signaling pathways.

ukemi commented 1 year ago

I put a copy of the report in a spreadsheet in the project Google folder.

https://docs.google.com/spreadsheets/d/1wSUrHy-FSxtaaBQFC90kWkmqOMvO2Tp8-Uu2GffBUq4/edit#gid=0

dustine32 commented 1 year ago

For @huaiyumi, here is the list of molecular events by reaction: reactome_all_mol_events.txt

huaiyumi commented 1 year ago

@dustine32 Do you think you can expand this file by adding the molecular events, in the same format as this one? https://docs.google.com/spreadsheets/d/1WxHvGZzf3tEX3GtCvSMGFRhCev4TaCuJLrPs_whDUNs/edit#gid=0 The data in columns C and D are useful for our curation.

huaiyumi commented 1 year ago

Oh, there are two files. I only downloaded the first one, reactome_all_mol_event_pathways.txt. I guess the other has everything.

deustp01 commented 1 year ago

> @dustine32 Do you think you can expand this file by adding the molecular events to the file like this one?

That file is the one we used to identify events without specific molecular functions within the Reactome top-level domain of "signal transduction". A useful way to proceed might be to take the existing report, reactome_all_mol_event_pathways.txt (also available as the spreadsheet https://docs.google.com/spreadsheets/d/1wSUrHy-FSxtaaBQFC90kWkmqOMvO2Tp8-Uu2GffBUq4/edit#gid=0), and re-order it so that all events falling into each superpathway are grouped together. My hunch is that each such domain of biology will be enriched in a distinct set of problems (as signaling is in binding events that lack an enabler but for which an enabler can often be inferred). If this is right, the grouping will be really useful for dividing the large re-annotation task into smaller, more narrowly defined tasks.
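As a rough sketch of that re-ordering, assuming the report can be read as (pathway, event count) pairs and that a pathway-to-superpathway lookup is available from somewhere (both are assumptions; the actual file layout is not described here). The function name `group_by_superpathway` is hypothetical, and the pathway names and counts below are made-up placeholders, not values from the report.

```python
from collections import defaultdict


def group_by_superpathway(event_counts, superpathway_of):
    """Group (pathway, count) rows under their superpathway.

    Groups are ordered by total event count (largest first); rows inside
    a group are ordered the same way. Pathways missing from the lookup
    are collected under "unassigned".
    """
    groups = defaultdict(list)
    for pathway, count in event_counts:
        top = superpathway_of.get(pathway, "unassigned")
        groups[top].append((pathway, count))
    ordered = sorted(
        groups.items(),
        key=lambda item: (-sum(c for _, c in item[1]), item[0]),
    )
    return [
        (top, sorted(rows, key=lambda row: (-row[1], row[0])))
        for top, rows in ordered
    ]


# Placeholder rows and lookup, invented for this sketch.
rows = [("Pathway A", 2), ("Pathway B", 5), ("Pathway C", 1)]
lookup = {
    "Pathway A": "Signal Transduction",
    "Pathway B": "Metabolism",
    "Pathway C": "Signal Transduction",
}

for top, members in group_by_superpathway(rows, lookup):
    print(top, members)
```

Sorting groups by total count puts the largest re-annotation tasks first, which may or may not be the most useful order for dividing up the work.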