belbio / bep

BEL Enhancement Proposals
http://bep.bel.bio
Apache License 2.0

Write BEP for partOf relationship #20

Closed cthoyt closed 5 years ago

cthoyt commented 6 years ago

Like the noCorrelation and equivalentTo relationships, a partOf relationship would allow much better expression of the semantics common to other knowledge assemblies. See FamPlex as a good example. In some cases, partOf acts as the inverse of the hasComponent relationship, but with broader semantics.

We should consider the semantics of partOf and hasPart as described in the Gene Ontology as well:

In BEL 1.0/2.0 this made sense:

complex(FPLX:9_1_1) hasComponent p(HGNC:HUS1)
complex(FPLX:9_1_1) hasComponent p(HGNC:RAD1)
complex(FPLX:9_1_1) hasComponent p(HGNC:RAD9A)

It might be better to write in the other direction:

p(HGNC:HUS1)  partOf complex(FPLX:9_1_1)
p(HGNC:RAD1)  partOf complex(FPLX:9_1_1)
p(HGNC:RAD9A) partOf complex(FPLX:9_1_1) 
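As a minimal sketch of the rewrite above, the following Python turns each hasComponent statement into the corresponding partOf statement by swapping subject and object. The helper name `has_component_to_part_of` is hypothetical, and the naive string split assumes flat statements like the ones shown here; it is not a BEL parser.

```python
def has_component_to_part_of(statement: str) -> str:
    """Rewrite 'X hasComponent Y' as 'Y partOf X' (hypothetical helper)."""
    # Split on the relation keyword; only flat, un-nested statements are handled.
    subject, sep, obj = statement.partition(" hasComponent ")
    if not sep:
        raise ValueError(f"not a hasComponent statement: {statement!r}")
    return f"{obj.strip()} partOf {subject.strip()}"


statements = [
    "complex(FPLX:9_1_1) hasComponent p(HGNC:HUS1)",
    "complex(FPLX:9_1_1) hasComponent p(HGNC:RAD1)",
    "complex(FPLX:9_1_1) hasComponent p(HGNC:RAD9A)",
]
rewritten = [has_component_to_part_of(s) for s in statements]
```

Because partOf is described as having broader semantics than hasComponent, the sketch only goes in this direction; rewriting an arbitrary partOf statement back into hasComponent is not assumed to be valid.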

Reasoning over partOf

Basically, any directed path containing only partOf and isA edges, with at least one partOf edge, implies partOf.

A partOf B and B isA C implies A partOf C

For proteins, this would describe a hierarchy of complexes.

p(X) partOf complex(Y)
complex(Y) isA complex(Z)

# Infer:
p(X) partOf complex(Z)

A partOf B and B partOf C implies A partOf C

For proteins, this would describe a complex formed of other complexes.

p(X) partOf complex(Y)
complex(Y) partOf complex(Z)

# Infer:
p(X) partOf complex(Z)
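The two rules above can be applied repeatedly until no new statements appear. The sketch below assumes statements are held as plain (subject, relation, object) tuples; the function name `infer_part_of` and the tuple representation are illustrative choices, not part of any BEL tooling.

```python
Triple = tuple[str, str, str]


def infer_part_of(triples: set[Triple]) -> set[Triple]:
    """Return the partOf triples implied by the two rules, excluding the input."""
    known = set(triples)
    changed = True
    while changed:  # repeat until a pass adds nothing new (fixed point)
        changed = False
        part_of = [(s, o) for s, r, o in known if r == "partOf"]
        for a, b in part_of:
            for s, r, c in list(known):
                # A partOf B and (B isA C or B partOf C) implies A partOf C
                if s == b and r in ("isA", "partOf"):
                    new = (a, "partOf", c)
                    if new not in known:
                        known.add(new)
                        changed = True
    return known - set(triples)


via_is_a = infer_part_of({
    ("p(X)", "partOf", "complex(Y)"),
    ("complex(Y)", "isA", "complex(Z)"),
})
via_part_of = infer_part_of({
    ("p(X)", "partOf", "complex(Y)"),
    ("complex(Y)", "partOf", "complex(Z)"),
})
```

Both calls yield the single inferred statement `p(X) partOf complex(Z)`. Rules that start from an isA edge are deliberately left out of this sketch.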

Does FamPlex have an example for either of these first two headings (@johnbachman)? I will write a script to check https://github.com/sorgerlab/famplex/blob/master/relations.csv.
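One way such a check could look is sketched below. It assumes, without confirmation from this thread, that each row of relations.csv has five columns (subject namespace, subject identifier, relation, object namespace, object identifier) and that the relation values are `isa` and `partof`. The function name `find_chains` is hypothetical, and the sample rows are invented placeholders, not FamPlex content.

```python
import csv
import io


def find_chains(csv_text: str) -> dict[str, list[tuple[str, str, str]]]:
    """Find 'partof then isa' and 'partof then partof' chains in the rows."""
    rows = csv.reader(io.StringIO(csv_text))
    # Assumed layout: subj_ns, subj_id, relation, obj_ns, obj_id
    edges = [((r[0], r[1]), r[2].strip().lower(), (r[3], r[4]))
             for r in rows if len(r) == 5]
    outgoing: dict = {}
    for s, rel, o in edges:
        outgoing.setdefault(s, []).append((rel, o))
    chains: dict[str, list[tuple[str, str, str]]] = {
        "partof_isa": [], "partof_partof": [],
    }
    for s, rel, o in edges:
        if rel != "partof":
            continue
        # Follow the second hop from the object of the partof edge.
        for rel2, o2 in outgoing.get(o, []):
            if rel2 in ("isa", "partof"):
                chains[f"partof_{rel2}"].append(
                    (":".join(s), ":".join(o), ":".join(o2)))
    return chains


# Invented placeholder rows, only to exercise the function.
sample_rows = "NS,A,partof,NS,B\nNS,B,isa,NS,C\nNS,B,partof,NS,D\n"
chains = find_chains(sample_rows)
```

To run it against the real file, the contents of relations.csv would be read into a string and passed to `find_chains`; an empty list under either key would mean no example of that pattern exists.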

A isA B and B partOf C implies A partOf C

p(HGNC:PRKAA1) isA p(FPLX:AMPK_alpha)
p(HGNC:PRKAA2) isA p(FPLX:AMPK_alpha)
p(FPLX:AMPK_alpha) partOf p(FPLX:AMPK)

# Infer:
p(HGNC:PRKAA1) partOf p(FPLX:AMPK)
p(HGNC:PRKAA2) partOf p(FPLX:AMPK)

Does this make sense? Should we infer that all AMPK complexes have PRKAA1? I don't think that's what FamPlex is trying to say. If p(X1) isA p(X) and p(X2) isA p(X) and p(X) partOf complex(Y), then should we infer that either X1 or X2 is present in Y?
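To make the "either X1 or X2" reading concrete, the sketch below groups the members of a family instead of asserting partOf for each one. It returns, for every family that is partOf something, the set of members of which at least one would be present. This is only one possible interpretation; the function name `members_by_family_part` and the tuple representation are hypothetical.

```python
from collections import defaultdict


def members_by_family_part(triples):
    """Map (family, whole) to the members, at least one of which is in the whole."""
    members = defaultdict(set)
    for s, r, o in triples:
        if r == "isA":
            members[o].add(s)  # s is a member of family o
    result = {}
    for s, r, o in triples:
        if r == "partOf" and s in members:
            # Do not assert 'member partOf whole' for each member;
            # keep the alternatives together as one group.
            result[(s, o)] = frozenset(members[s])
    return result


ampk = [
    ("p(HGNC:PRKAA1)", "isA", "p(FPLX:AMPK_alpha)"),
    ("p(HGNC:PRKAA2)", "isA", "p(FPLX:AMPK_alpha)"),
    ("p(FPLX:AMPK_alpha)", "partOf", "p(FPLX:AMPK)"),
]
groups = members_by_family_part(ampk)
```

Here `groups` holds a single entry that pairs AMPK_alpha and AMPK with the set {PRKAA1, PRKAA2}, leaving open which of the two is present in a given complex.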

wshayes commented 6 years ago

I'm thinking I'd like to remove the computed edges from the Specification and instead add guidance on approaches for dealing with computed edges. I'm currently re-working computed edges in BELbio, and all of the computed relations start with has*. I'm trying to reduce expansion of edges in the graph database by not putting in both directions for every reversible edge. I'm not sure whether that strategy will survive long term.

cthoyt commented 6 years ago

Is it possible to remove information from the specification, such as the definitions of protein families with has* edges?

ncatlett commented 6 years ago

hasComponent is used both as an edge type for curation (i.e., for specifying the components of a named complex) and as a computed edge (i.e., for expanding the components of a composed complex).

I don't think we need to include both directions of a relationship/edge in the BEL language, but both are needed for graph traversal. I don't have a strong opinion on which direction we include in the language specification, only a preference for consistency.
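A minimal sketch of that separation: only one direction is stated, and a reverse index makes the edge traversable from either end without adding a second relation. The class `EdgeStore` and its method names are hypothetical and do not reflect how BELbio or any graph database implements this.

```python
from collections import defaultdict


class EdgeStore:
    """Keeps each edge once, indexed by both subject and object."""

    def __init__(self):
        self._out = defaultdict(list)  # subject -> [(relation, object)]
        self._in = defaultdict(list)   # object -> [(relation, subject)]

    def add(self, subject, relation, obj):
        self._out[subject].append((relation, obj))
        self._in[obj].append((relation, subject))

    def objects(self, subject, relation):
        """Follow the edge in its stated direction."""
        return [o for r, o in self._out[subject] if r == relation]

    def subjects(self, obj, relation):
        """Follow the same edge backwards, without a stored inverse edge."""
        return [s for r, s in self._in[obj] if r == relation]


store = EdgeStore()
for protein in ("p(HGNC:HUS1)", "p(HGNC:RAD1)", "p(HGNC:RAD9A)"):
    store.add(protein, "partOf", "complex(FPLX:9_1_1)")

parts = store.subjects("complex(FPLX:9_1_1)", "partOf")
wholes = store.objects("p(HGNC:HUS1)", "partOf")
```

With this layout the language only needs partOf, while a query for the components of a complex is answered by walking the partOf edges in reverse.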

BEL also includes subProcessOf, which is a partOf relationship for processes. Is there any reason we can't/shouldn't streamline BEL to use the same partOf relationship for both abundances and processes?

cthoyt commented 5 years ago

Since I've added PR #30, I'll close this issue. I copied the relevant discussion into that BEP.