Update omnicorp build to handle pubchem/chebi better.

Consider this query:

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "id": "UniProtKB:P52788",
                    "category": "biolink:Gene"
                },
                "n1": {
                    "category": "biolink:ChemicalSubstance"
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                }
            }
        }
    }
}

Currently, this returns a bunch of chemicals, normalized to PubChem ids. Omnicorp knows about PubChem ids, but, I guess because chemicals are named differently in PubChem, there are lots of cases where omnicorp has results for the ChEBI id but not for the equivalent PubChem id.

Originally, I was thinking that the omnicorp overlay should look at the equivalent identifiers of each node in the input graph and query the cache/postgres for those identifiers as well.

But I think that's wrong. First, you only get back counts, so if you get results for two equivalent identifiers, there's no good way to combine them or decide between them. Second, it leads to a lot of extra (and probably repeated) querying. Now I think we should resolve this upstream, when we build the omnicorp database and cache. All we need to do is normalize identifiers at the point where we still have the actual PubMed ids, so that the publications for equivalent identifiers can be combined.
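A minimal sketch of why this has to happen at build time, assuming the build has a mapping from each curie to the set of PubMed ids that mention it. The `PREFERRED` map, the function names, and the PubMed ids below are all hypothetical placeholders; in practice the equivalences would come from whatever normalization the build uses, and this is not the existing omnicorp code.

```python
from collections import defaultdict

# Hypothetical equivalence map (curie -> preferred curie).
# Illustrative only; real mappings would come from the normalizer.
PREFERRED = {
    "CHEBI:15365": "PUBCHEM.COMPOUND:2244",
    "PUBCHEM.COMPOUND:2244": "PUBCHEM.COMPOUND:2244",
}


def merge_by_preferred_id(curie_to_pmids, preferred):
    """Union the PubMed id sets of equivalent curies under one preferred curie."""
    merged = defaultdict(set)
    for curie, pmids in curie_to_pmids.items():
        # Curies with no known equivalent keep their own identifier.
        merged[preferred.get(curie, curie)] |= set(pmids)
    return dict(merged)


def shared_count(curie_to_pmids, a, b):
    """Number of publications mentioning both curies."""
    return len(curie_to_pmids.get(a, set()) & curie_to_pmids.get(b, set()))


# Placeholder PubMed ids, not real publications.
raw = {
    "CHEBI:15365": {1, 2},
    "PUBCHEM.COMPOUND:2244": {2, 3},
    "UniProtKB:P52788": {2, 3, 4},
}

merged = merge_by_preferred_id(raw, PREFERRED)

# With the id sets still available, the shared count is exact.
exact = shared_count(merged, "PUBCHEM.COMPOUND:2244", "UniProtKB:P52788")

# With only per-identifier counts (what the overlay sees), summing
# double-counts any publication that mentions both equivalent ids.
naive = (shared_count(raw, "CHEBI:15365", "UniProtKB:P52788")
         + shared_count(raw, "PUBCHEM.COMPOUND:2244", "UniProtKB:P52788"))
```

Here `exact` counts each shared publication once, while `naive` counts placeholder id 2 twice. Once only counts are stored, that overlap can no longer be recovered, which is the argument for merging before the counts are computed.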

The downside of this approach is that it ties the cache to the normalization and to the biolink prefix ordering.
