Open avelar-ageing opened 6 months ago
@DeniseSl22, didn't we write a SPARQL query for this at some point in time? Or was that just on my long wish-/todo list?
The pathway WP5424 is not in the RDF yet, but the following SPARQL should give you some idea of how to do this:
SELECT ?wpid ?catalyst ?source ?target WHERE {
  ?pathway a wp:Pathway ;
           dc:identifier / dcterms:identifier ?wpid .
  ?catalysis a wp:Catalysis ;
             dcterms:isPartOf ?pathway ;
             wp:source / rdfs:label ?catalyst ;
             wp:participants ?reaction .
  ?reaction a wp:Interaction .
  OPTIONAL { ?reaction wp:source ?source }
  OPTIONAL { ?reaction wp:target ?target }
} ORDER BY ASC(?catalysis)
@avelar-ageing, thanks for your question! I've modified @egonw's query slightly; see below.
I believe that the reactions without a clear source and/or target are not relevant in this case (and require some curation on our side), so the query below requires both. There are also a number of interactions between two metabolites that have not been drawn with the MIM-Catalysis interaction type, but with a regular arrow. I've put the type restriction on its own line in the SPARQL query (see below), so you can comment it in or out to see the difference in the response (# starts a comment in SPARQL). With only interactions of type MIM-Catalysis included, you get 5296 results; with the line commented out, you get 6189 results (so ~900 more).
I've also added a way to unify the metabolite annotations to one database (Wikidata here; others are possible, e.g. HMDB, ChEBI, PubChem), in case you want to merge the data at a later stage. Unifying the enzyme annotations can be done in a similar manner (to HGNC, Ensembl, UniProt, etc.).
Also note that this covers all pathways (WikiPathways and Reactome) and all species. I hope the above helps; if not, feel free to ask another question here.
SELECT DISTINCT ?wpid ?catalyst ?source ?sourceDb ?target ?targetDb WHERE {
  ?pathway a wp:Pathway ;
           dc:identifier / dcterms:identifier ?wpid .
  # ?catalysis a wp:Catalysis .
  ?catalysis dcterms:isPartOf ?pathway ;
             wp:source / rdfs:label ?catalyst ;
             wp:participants ?reaction .
  ?reaction a wp:Interaction .
  ?reaction wp:source ?source .
  ?source a wp:Metabolite .
  OPTIONAL { ?source wp:bdbWikidata ?sourceDb . }
  ?reaction wp:target ?target .
  ?target a wp:Metabolite .
  OPTIONAL { ?target wp:bdbWikidata ?targetDb . }
} ORDER BY ASC(?source)
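If you end up switching the catalysis line and the target database back and forth, it may be easier to generate the query text than to edit it by hand. Below is a minimal Python sketch of that idea. The function name `build_catalysis_query` and the lookup table are hypothetical helpers, not part of any WikiPathways package. Only `wp:bdbWikidata` is taken from the query above; `wp:bdbHmdb`, `wp:bdbChEBI` and `wp:bdbPubChem` are assumed to follow the same naming and should be checked against the WikiPathways vocabulary before use. The code only builds a string and does not contact any endpoint.

```python
# Hypothetical helper: builds the SPARQL text only, no network access.

# Predicates for unifying metabolite identifiers. wp:bdbWikidata comes from the
# query in this thread; the other three are assumptions to verify before use.
METABOLITE_DB_PREDICATES = {
    "wikidata": "wp:bdbWikidata",
    "hmdb": "wp:bdbHmdb",
    "chebi": "wp:bdbChEBI",
    "pubchem": "wp:bdbPubChem",
}


def build_catalysis_query(catalysis_only=True, metabolite_db="wikidata"):
    """Return SPARQL text listing catalysts with their source/target metabolites."""
    key = metabolite_db.lower()
    if key not in METABOLITE_DB_PREDICATES:
        raise ValueError(f"unknown metabolite database: {metabolite_db!r}")
    db_predicate = METABOLITE_DB_PREDICATES[key]

    # A leading '#' makes the line a SPARQL comment, which drops the restriction
    # to interactions explicitly typed as catalysis.
    type_line = "?catalysis a wp:Catalysis ."
    if not catalysis_only:
        type_line = "# " + type_line

    lines = [
        "SELECT DISTINCT ?wpid ?catalyst ?source ?sourceDb ?target ?targetDb WHERE {",
        "  ?pathway a wp:Pathway ;",
        "           dc:identifier / dcterms:identifier ?wpid .",
        "  " + type_line,
        "  ?catalysis dcterms:isPartOf ?pathway ;",
        "             wp:source / rdfs:label ?catalyst ;",
        "             wp:participants ?reaction .",
        "  ?reaction a wp:Interaction .",
        "  ?reaction wp:source ?source .",
        "  ?source a wp:Metabolite .",
        "  OPTIONAL { ?source " + db_predicate + " ?sourceDb . }",
        "  ?reaction wp:target ?target .",
        "  ?target a wp:Metabolite .",
        "  OPTIONAL { ?target " + db_predicate + " ?targetDb . }",
        "} ORDER BY ASC(?source)",
    ]
    return "\n".join(lines)
```

For example, `build_catalysis_query(catalysis_only=False, metabolite_db="chebi")` returns the broader variant (type line commented out) with both OPTIONAL clauses pointing at the assumed ChEBI predicate; the resulting text can then be pasted into the SPARQL endpoint or sent with whichever client you prefer.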
I am interested in downloading metabolic enzymes from pathways. For example, in the omega-3 senescence pathway (https://www.wikipathways.org/pathways/WP5424.html) there are various genes that are not directly linked to metabolism, including p21. I think it should be possible to identify metabolism genes by taking all genes involved in conversion MIM interactions. Is there a way to extract just these genes, as opposed to all genes in the pathway, using the R package?
Thanks