superphy / prairiedog

next-gen pangenome graphs for predictive genomics

"uid_in" queries are slow #145

Open kevinkle opened 4 years ago

kevinkle commented 4 years ago
2019-08-22 20:01:46 30696b53a772 prairiedog[1] DEBUG Using query:

        {
            q(func: has(fd)) @filter(uid_in(fd, 0x6b9314)) {
                uid
            }
        }

2019-08-22 20:07:01 30696b53a772 prairiedog[1] DEBUG Got res:
json: "{\"q\":[{\"uid\":\"0xdb96e\"},{\"uid\":\"0x48a0e9\"},{\"uid\":\"0x8adae1\"},{\"uid\":\"0xe446d3\"},{\"uid\":\"0x10c22ea\"},{\"uid\":\"0x10d847a\"},{\"uid\":\"0x1101530\"}]}"
txn {
  start_ts: 30003
}
latency {
  parsing_ns: 10845
  processing_ns: 314512391781
  encoding_ns: 758718
}

 of type <class 'api_pb2.Response'>
2019-08-22 20:07:01 30696b53a772 prairiedog[1] DEBUG Decoded as:
{'q': [{'uid': '0xdb96e'}, {'uid': '0x48a0e9'}, {'uid': '0x8adae1'}, {'uid': '0xe446d3'}, {'uid': '0x10c22ea'}, {'uid': '0x10d847a'}, {'uid': '0x1101530'}]}
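
For scale, the latency block in the response above can be converted to seconds. This is a minimal sketch that uses only the three values printed in the log; the variable names are illustrative and not part of prairiedog.

```python
# Latency values copied from the Dgraph response logged above (nanoseconds).
latency_ns = {
    "parsing": 10_845,
    "processing": 314_512_391_781,
    "encoding": 758_718,
}

NS_PER_S = 1_000_000_000

total_ns = sum(latency_ns.values())
for stage, ns in latency_ns.items():
    share = ns / total_ns
    # Print each stage in seconds and as a share of the total reported latency.
    print(f"{stage:>10}: {ns / NS_PER_S:12.6f} s ({share:.4%} of total)")
```

Processing comes to roughly 314.5 s, which lines up with the gap between the two log timestamps (20:01:46 to 20:07:01). Parsing and encoding are negligible.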

kevinkle commented 4 years ago

Well, pyinstrument needs to be set up a bit more...

2019-08-22 22:40:27 486e99c79a12 prairiedog[1] INFO Profiler output:

  _     ._   __/__   _ _  _  _ _/_   Recorded: 21:50:23  Samples:  149
 /_//_/// /_\ / //_// / //_'/ //     Duration: 3004.082  CPU time: 1.403
/   _/                      v3.0.3

Program: /opt/venv/bin/prairiedog --debug --profiler query GACTACATAAA AACCTCCGGCT

3004.081 invoke  click/core.py:518
└─ 2998.072 query  prairiedog/cli.py:81
      [392 frames hidden]  prairiedog, pydgraph, grpc, threading...
         2229.320 _blocking  grpc/_channel.py:538
         └─ 2229.317 [self]
         741.442 _blocking  grpc/_channel.py:538

kevinkle commented 4 years ago

The main thing would be to find a better way of doing Dgraph.find_edges_reverse()

kevinkle commented 4 years ago

We can probably get what we want by defining a reverse edge, <er> or something, for every edge. Then we can just query from the target node at normal speed instead of filtering with uid_in.
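
A rough sketch of that idea, not tested against the real graph: for every forward `fd` edge, also write a mirror edge under a second predicate, then look up incoming edges by starting at the target node rather than scanning with `has(fd)`. The predicate name `er` and both helper names are placeholders, not existing prairiedog code. Dgraph also has a built-in `@reverse` schema directive (queried as `~fd`) that might make the extra predicate unnecessary.

```python
def edge_nquads(src_uid: str, dst_uid: str,
                forward: str = "fd", reverse: str = "er") -> str:
    """Return N-Quads for a forward edge plus its mirror edge.

    The reverse predicate name ("er") is hypothetical.
    """
    return (
        f"<{src_uid}> <{forward}> <{dst_uid}> .\n"
        f"<{dst_uid}> <{reverse}> <{src_uid}> ."
    )


def reverse_lookup_query(dst_uid: str, reverse: str = "er") -> str:
    """Build a query that starts at the target node and follows the mirror edge."""
    return (
        "{\n"
        f"  q(func: uid({dst_uid})) {{\n"
        f"    {reverse} {{\n"
        "      uid\n"
        "    }\n"
        "  }\n"
        "}"
    )


# UIDs taken from the slow query and its result earlier in this issue.
print(edge_nquads("0xdb96e", "0x6b9314"))
print(reverse_lookup_query("0x6b9314"))
```

With pydgraph, the first string would presumably be passed as `set_nquads` to `txn.mutate()` and the second to `txn.query()`. Whether this is actually faster on this dataset still needs to be measured.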

kevinkle commented 4 years ago

Didn't mean to close the issue. This will have to be tested at a future date.