How to handle supertypes?

Currently we allow paths to use the union supertypes. So if we want a spot in the query to be either a disease or a phenotypic_feature, we use the type disease_or_phenotypic_feature. This calls everything it should, and you end up with some nodes that are diseases and some that are phenotypic_features, but in the graph they all have the type disease_or_phenotypic_feature.

In the case of a cached graph, this is a problem: if you later ask for a subclass (say, disease), you won't find those nodes, unless your query knows about the biolink-model. We would prefer that the final knowledge graph contain the pushed-down node type (disease or phenotypic_feature). This is entirely reasonable and doable.
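Here is a rough sketch of what "pushing down" could look like. Nothing here is robokop code: `UNION_MEMBERS`, `PREFIX_TO_TYPE`, and `push_down_type` are made-up names, and guessing the concrete type from the identifier prefix is purely an assumption for illustration (the real source of the concrete type might be whatever service returned the node).

```python
# Hypothetical table: union supertype -> the concrete types it covers.
UNION_MEMBERS = {
    "disease_or_phenotypic_feature": ("disease", "phenotypic_feature"),
}

# Assumption for this sketch only: the CURIE prefix tells us the concrete type.
PREFIX_TO_TYPE = {
    "MONDO": "disease",
    "HP": "phenotypic_feature",
}


def push_down_type(node):
    """Return a copy of `node` with the most specific type we can determine."""
    members = UNION_MEMBERS.get(node["type"])
    if members is None:
        # Not a union supertype, nothing to push down.
        return dict(node)
    prefix = node["id"].split(":", 1)[0]
    concrete = PREFIX_TO_TYPE.get(prefix)
    if concrete in members:
        return {**node, "type": concrete}
    # Unknown prefix: keep the union type rather than guess.
    return dict(node)
```

With something like this applied before nodes are written to the cache, a later query for plain `disease` would find the nodes that originally came in through a `disease_or_phenotypic_feature` slot.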

There is also an idea that the query should handle this by containing a list of types. So instead of saying "disease_or_phenotypic_feature" you would specify that a node could be one of ("disease", "phenotypic_feature"). The thought is that this makes constructing queries easier? I guess I'm not convinced that this is the simplest approach. If "disease_or_phenotypic_feature" had a shorter name, like "yellow", you'd just ask for a "yellow" node.
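For what it's worth, the two ways of writing the query can be treated as equivalent with very little code. This is only a sketch with hypothetical names (`UNION_MEMBERS`, `allowed_types`), not how robokop parses queries today: it accepts either a single type name or a list of names and reduces both to the same set of concrete types.

```python
# Hypothetical table: union supertype -> the concrete types it covers.
UNION_MEMBERS = {
    "disease_or_phenotypic_feature": {"disease", "phenotypic_feature"},
}


def allowed_types(spec):
    """Normalize a node type spec (one name or a list of names) to a set of concrete types."""
    names = [spec] if isinstance(spec, str) else list(spec)
    result = set()
    for name in names:
        # A union name expands to its members; any other name stands for itself.
        result |= UNION_MEMBERS.get(name, {name})
    return result
```

Under this sketch, `allowed_types("disease_or_phenotypic_feature")` and `allowed_types(["disease", "phenotypic_feature"])` give the same set, so supporting both spellings would not force a choice between them.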

The other thing to consider is whether we expect biolink-model to get more complicated. The deepest case at the moment is (indentation indicates subclass):

biological_process_or_molecular_activity
  biological_process
    pathway
  molecular_activity

So in this case, if you want all of these (which I suggest is the most common case), you either specify the top node or a list of three types (which you have to know to ask for).
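To make the trade-off concrete, here is a small sketch that walks the hierarchy above. The `CHILDREN` table is hand-copied from the listing in this issue (not loaded from biolink-model), and `with_descendants` is a hypothetical helper name.

```python
# Subclass relationships copied by hand from the listing above.
CHILDREN = {
    "biological_process_or_molecular_activity": [
        "biological_process",
        "molecular_activity",
    ],
    "biological_process": ["pathway"],
}


def with_descendants(type_name):
    """Return `type_name` followed by all of its subclasses, depth first."""
    found = [type_name]
    for child in CHILDREN.get(type_name, []):
        found.extend(with_descendants(child))
    return found


print(with_descendants("biological_process_or_molecular_activity"))
```

Asking for the top node gets you everything with one name. The list form means the person writing the query has to do this expansion in their head, and redo it if the model grows deeper.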

@kennethmorton @patrickkwang opinions?
