hetio / hetionet

Hetionet: an integrative network of disease
https://neo4j.het.io

Multiple Match Queries Not Working #40

Closed thomasleb0n closed 2 months ago

thomasleb0n commented 3 years ago

Hello,

I have a list of genes, and I want to print their immediate connections to nodes with the following labels: Compound, BiologicalProcess, and MolecularFunction. I want exactly three nodes per label for every gene in the array.

The following query works with two OPTIONAL MATCH clauses. I ran it in the Neo4j Browser to get the Compound and BiologicalProcess relations:

UNWIND ["ACVR1B","ACVRL1","ADCY6","AQP2","AQP6","ARF3","C12orf10","C1QL4","CCDC184","CNTN1","COX14","DNAJC22","EIF4B","ENDOU","ESPL1","FAIM2","GALNT6","KIF21A","LMBR1L","MFSD5","NPFF","PCED1B","PDZRN4","PFDN5","POU6F1","PRPH","PRR13","RACGAP1","RAPGEF3","RND1","RPAP3","SLC4A8","SPRYD3","TENC1","TMBIM6","TROAP","TUBA1A","TUBA1B","TUBA1C"] as text
MATCH (g:Gene) WHERE g.name = text
OPTIONAL MATCH (g)--(c:Compound)
WITH g,collect(c.name)[..3] as comp
OPTIONAL MATCH (g)--(b:BiologicalProcess)
RETURN g.name,comp,collect(b.name)[..3] as bio

But now I want to get all three relations with the following query:

UNWIND ["ACVR1B","ACVRL1","ADCY6","AQP2","AQP6","ARF3","C12orf10","C1QL4","CCDC184","CNTN1","COX14","DNAJC22","EIF4B","ENDOU","ESPL1","FAIM2","GALNT6","KIF21A","LMBR1L","MFSD5","NPFF","PCED1B","PDZRN4","PFDN5","POU6F1","PRPH","PRR13","RACGAP1","RAPGEF3","RND1","RPAP3","SLC4A8","SPRYD3","TENC1","TMBIM6","TROAP","TUBA1A","TUBA1B","TUBA1C"] as text
MATCH (g:Gene) WHERE g.name = text
OPTIONAL MATCH (g)--(c:Compound)
WITH g,collect(c.name)[..3] as comp
OPTIONAL MATCH (g)--(b:BiologicalProcess)
WITH g, collect(b.name)[..3] as bio
OPTIONAL MATCH (g)--(m:MolecularFunction)
RETURN g.name,comp,bio,collect(m.name)[..3] as mol

When I run this, the following error occurs:

Neo.ClientError.Statement.SyntaxError: Variable `comp` not defined (line 8, column 15 (offset: 586))
"RETURN g.name,comp,bio,collect(m.name)[..3] as mol"

Does anyone know a more efficient way to handle this problem, or understand why the comp variable is suddenly undefined after a third OPTIONAL MATCH clause?

Thank you in advance!

thomasleb0n commented 3 years ago

Update: I found a temporary fix. I'm not sure exactly how or why it works.

UNWIND ["AAAS","AMIGO2","ANKRD33","ATG101","CCDC65","CELA1","CSRNP2","DAZAP2","DDN","IGFBP6","KCNH3","KRT1","KRT18","KRT2","KRT4","KRT5","KRT6A","KRT6B","KRT7","KRT71","KRT72","KRT73","KRT75","KRT76","KRT77","KRT78","KRT79","KRT8","KRT80","KRT81","KRT82","KRT83","KRT84","KRT86","LETMD1","MAP3K12","SLC38A4","SPATS2","YAF2"] as text
MATCH (g:Gene) WHERE g.name = text
OPTIONAL MATCH (g)--(c:Compound)
WITH g,collect(c.name)[..5] as comp, text
OPTIONAL MATCH (g)--(b:BiologicalProcess)
WITH g, collect(b.name)[..3] as bio, comp, text
OPTIONAL MATCH (g)--(m:MolecularFunction)
RETURN text as Gene,comp as Compound,bio as BiologicalProcess,collect(m.name)[..3] as MolecularFunction
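
The fix above works because WITH in Cypher acts as a projection: only the variables listed in a WITH clause stay in scope for the rest of the query. In the failing query, the second WITH listed g and bio but not comp, so comp was dropped before RETURN. Carrying comp (and text) through each WITH keeps them defined. As a possible alternative that avoids chained WITH clauses altogether, here is a minimal sketch using pattern comprehensions, assuming a Neo4j version that supports them (3.1 or later). The shortened gene list is illustrative only, and the sketch is untested against Hetionet.

```cypher
// Shortened, illustrative gene list; substitute the full array as needed.
UNWIND ["ACVR1B", "ACVRL1", "ADCY6"] AS text
MATCH (g:Gene) WHERE g.name = text
// Each pattern comprehension builds its own list per gene, so no
// intermediate WITH is needed and no variable can fall out of scope.
// [..3] keeps at most the first three names; a gene with no neighbours
// of a given label gets an empty list.
RETURN g.name AS Gene,
       [(g)--(c:Compound) | c.name][..3] AS Compound,
       [(g)--(b:BiologicalProcess) | b.name][..3] AS BiologicalProcess,
       [(g)--(m:MolecularFunction) | m.name][..3] AS MolecularFunction
```

Like the chained version, this returns one row per matched gene; genes in the list that are not found in the graph produce no row.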

dhimmel commented 3 years ago

Cool, thanks for the update. I don't have a ton of time right now to help with the Cypher, but it looks like you're on the right track. I've found Stack Overflow and the Neo4j community forum to be helpful resources in the past for Cypher questions.