NCATS-Tangerine / beacon-aggregator

A web service that operates over the Beacon network to provide a single software interface over all the Beacons

Discovery is inexplicably slow #87

Open lhannest opened 5 years ago

lhannest commented 5 years ago

I posted a concept query with keywords=["phone"] and left it alone for a few minutes. When I came back, still no data had been discovered: https://kba.ncats.io/concepts/status/E5wH3pLwZ3AM3oE1JTEl. Then I ran the same query directly against one of the beacons (https://kba.ncats.io/beacon/rkb/concepts?keywords=phone&size=1000) and data came back almost immediately.

lhannest commented 5 years ago

Running the aggregator locally doesn't have this problem, even against the same beacons. Maybe the application running on the server is out of date?

lhannest commented 5 years ago

There's no error in the server logs, which is worrying:

kba           | 2018-11-26 16:22:03.805  INFO 1 --- [nio-8080-exec-7] o.n.o.drivers.http.request.HttpRequest   : Thread: 42, url: http://blackboard:7474/db/data/transaction/commit, request: {"statements":[{"statement":"MATCH (n:`QueryTracker`) WHERE n.`queryString` = { `queryString_0` } WITH n RETURN n, ID(n)","parameters":{"queryString_0":"concepts:[phone];[];"},"resultDataContents":["graph","row"],"includeStats":false}]}
kba           | 2018-11-26 16:22:03.817  INFO 1 --- [nio-8080-exec-7] o.n.o.drivers.http.request.HttpRequest   : Thread: 42, url: http://blackboard:7474/db/data/transaction/582, request: {"statements":[{"statement":"UNWIND {rows} as row CREATE (n:`DatabaseEntity`:`QueryTracker`) SET n=row.props RETURN row.nodeRef as ref, ID(n) as id, row.type as type","parameters":{"rows":[{"nodeRef":-245,"type":"node","props":{"beaconsHarvested":[1,2,3,4,5,6,7],"queryString":"concepts:[phone];[];","version":1,"versionDate":1543249323813}}]},"resultDataContents":["row"],"includeStats":false}]}
kba           | 2018-11-26 16:31:45.771  INFO 1 --- [nio-8080-exec-8] o.n.o.drivers.http.request.HttpRequest   : Thread: 43, url: http://blackboard:7474/db/data/transaction/commit, request: {"statements":[{"statement":"MATCH path=(clique:ConceptClique)<-[:MEMBER_OF]-(concept:Concept)  WITH  \tSIZE(FILTER(x IN {filter} WHERE REPLACE(LOWER(concept.name),'-',' ') CONTAINS LOWER(x))) AS name_match,  \tSIZE(FILTER(x IN {filter} WHERE REPLACE(LOWER(concept.definition),'-',' ') CONTAINS LOWER(x))) AS def_match,  \tSIZE(FILTER(x IN {filter} WHERE ANY(s IN concept.synonyms WHERE REPLACE(LOWER(s),'-',' ') CONTAINS LOWER(x)))) AS syn_match, \tpath AS path  WHERE name_match > 0 OR def_match > 0 OR syn_match > 0 AND  (  \t{categories} IS NULL OR SIZE({categories}) = 0 OR \tANY(a IN {categories} WHERE ANY(b IN concept.categories WHERE TOLOWER(a) = TOLOWER(b))) )  RETURN path  ORDER BY name_match DESC, syn_match DESC  SKIP  ({pageNumber} - 1) * {pageSize}  LIMIT {pageSize}","parameters":{"filter":["phone"],"pageSize":10,"pageNumber":1,"categories":[]},"resultDataContents":["graph"],"includeStats":false}]}

I noticed that the server version was out of date, so I updated it and tried this concept query:

{
  "queryId": "GHzWxS9Som7azhx2C5wd",
  "keywords": [
    "phone"
  ],
  "categories": []
}

And things appear to be working properly now: https://kba.ncats.io/concepts/status/GHzWxS9Som7azhx2C5wd

lhannest commented 5 years ago

The bug has come back. KBA appears to be using a massive amount of CPU (the java process shows 796% in top, roughly eight cores' worth):

top - 18:41:14 up 225 days, 16:06,  2 users,  load average: 9.09, 9.06, 8.96
Tasks: 255 total,   2 running, 253 sleeping,   0 stopped,   0 zombie
%Cpu(s): 56.5 us,  0.3 sy,  0.0 ni, 43.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 65963760 total, 13862524 free, 31716400 used, 20384836 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 33255776 avail Mem 

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                            
 18074 root      20   0 18.056g 7.811g  11040 S 796.0 12.4  45067:57 java 

I stopped and restarted the container and top no longer reported this CPU usage, and the bug once again disappeared.

lhannest commented 5 years ago

The issue appears to be that clique building uses up the maximum number of threads. On the server the thread count seems to max out at 48, and any queries I initiate after that point do nothing.
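
One possible mitigation, sketched here as an assumption rather than a description of how the aggregator actually schedules its work: give clique building its own fixed-size thread pool so that it can never take the threads incoming queries need. The class name, method name, and pool sizes below are hypothetical and only illustrate the idea with the standard java.util.concurrent API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not taken from the beacon-aggregator code base.
class BoundedCliqueBuilding {

    // Floods a dedicated clique pool with more tasks than it has threads,
    // then checks that a query on a separate pool is still answered promptly.
    static String queryWhileCliquesBuild(int cliqueThreads, int cliqueTasks) throws Exception {
        // Fixed-size pool: surplus clique tasks wait in the pool's queue
        // instead of spawning more threads.
        ExecutorService cliquePool = Executors.newFixedThreadPool(cliqueThreads);
        // Separate pool standing in for the threads that serve queries.
        ExecutorService queryPool = Executors.newFixedThreadPool(2);
        try {
            for (int i = 0; i < cliqueTasks; i++) {
                final int id = i;
                Callable<String> cliqueTask = () -> {
                    Thread.sleep(50); // stand-in for slow clique building
                    return "clique-" + id;
                };
                cliquePool.submit(cliqueTask);
            }
            Callable<String> query = () -> "query answered";
            // Fails with a TimeoutException if the query cannot get a thread within 1 s.
            return queryPool.submit(query).get(1, TimeUnit.SECONDS);
        } finally {
            // Interrupt any remaining work so the JVM can exit.
            cliquePool.shutdownNow();
            queryPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(queryWhileCliquesBuild(4, 100));
    }
}
```

With 100 clique tasks queued on 4 threads, the query still returns because it runs on its own pool. Whether a cap like this fits the real clique-building code would need to be checked against the application itself.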