Open katameru opened 8 years ago
I sent this query:
{
"time": {
"start": "1h-ago",
"aggregator": "sum"
},
"filters": [
{ "id": "f1", "tags": [] }
],
"metrics": [
{ "id": "m1", "metric": "interface.p2p0.if_packets.tx", "filter": "f1" },
{ "id": "m2", "metric": "interface.p2p0.if_packets.tx", "filter": "f1" },
{ "id": "m3", "metric": "interface.bridge0.if_octets.rx", "filter": "f1" }
],
"expressions": [
{ "id": "e1", "expr": "m1" },
{ "id": "e2", "expr": "m2" },
{ "id": "e3", "expr": "m3" },
{ "id": "e12","expr": "e1+e2" }
]
}
output:
2016-07-06 17:49:33,980 INFO [OpenTSDB I/O Boss #1] ConnectionManager: [id: 0xe287bfdf, /0:0:0:0:0:0:0:1:53604 => /0:0:0:0:0:0:0:1:4242] OPEN
2016-07-06 17:49:33,981 INFO [OpenTSDB I/O Worker #1] ConnectionManager: [id: 0xe287bfdf, /0:0:0:0:0:0:0:1:53604 => /0:0:0:0:0:0:0:1:4242] BOUND: /0:0:0:0:0:0:0:1:4242
2016-07-06 17:49:33,981 INFO [OpenTSDB I/O Worker #1] ConnectionManager: [id: 0xe287bfdf, /0:0:0:0:0:0:0:1:53604 => /0:0:0:0:0:0:0:1:4242] CONNECTED: /0:0:0:0:0:0:0:1:53604
2016-07-06 17:49:34,322 INFO [OpenTSDB I/O Worker #1] QueryStats: Executing new query={"query":{"start":"1h-ago","end":null,"timezone":null,"options":null,"padding":false,"queries":[{"aggregator":"sum","metric":"interface.p2p0.if_packets.tx","tsuids":null,"downsample":null,"rate":false,"filters":[],"index":0,"tags":{},"explicitTags":false,"rateOptions":null,"filterTagKs":null},{"aggregator":"sum","metric":"interface.p2p0.if_packets.tx","tsuids":null,"downsample":null,"rate":false,"filters":[],"index":1,"tags":{},"explicitTags":false,"rateOptions":null,"filterTagKs":null},{"aggregator":"sum","metric":"interface.bridge0.if_octets.rx","tsuids":null,"downsample":null,"rate":false,"filters":[],"index":2,"tags":{},"explicitTags":false,"rateOptions":null,"filterTagKs":null}],"delete":false,"showTSUIDs":false,"msResolution":false,"showQuery":false,"showStats":false,"showSummary":false,"useCalendar":false,"globalAnnotations":false,"noAnnotations":false},"exception":"null","executed":1,"user":null,"stats":{},"requestHeaders":{"authorization":"accessKey:digest:epoch:nonce","Origin":"chrome-extension://fdmmgilgnpjigdojojpjoooidkmcomcm","Cache-Control":"no-cache","Accept":"*/*","Connection":"keep-alive","User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36","Host":"localhost:4242","DNT":"1","Accept-Encoding":"gzip, deflate","Accept-Language":"en-US,en;q=0.8","Content-Length":"562","Content-Type":"text/plain;charset=UTF-8"},"numRunningQueries":1,"httpResponse":null,"queryStartTimestamp":1467845374280,"queryCompletedTimestamp":0,"sentToClient":false}
2016-07-06 17:49:34,349 INFO [AsyncHBase I/O Worker #1] TsdbQuery: TsdbQuery(start_time=1467841774280, end_time=1467845374280, metric=[0, 0, 56], filters=[], rate=false, aggregator=sum, group_bys=()) matched 1 rows in 1 spans in 14.212251ms
2016-07-06 17:49:34,351 INFO [AsyncHBase I/O Worker #1] TsdbQuery: TsdbQuery(start_time=1467841774280, end_time=1467845374280, metric=[0, 0, 56], filters=[], rate=false, aggregator=sum, group_bys=()) matched 1 rows in 1 spans in 13.744573ms
2016-07-06 17:49:34,351 INFO [AsyncHBase I/O Worker #1] TsdbQuery: TsdbQuery(start_time=1467841774280, end_time=1467845374280, metric=[0, 0, 42], filters=[], rate=false, aggregator=sum, group_bys=()) matched 1 rows in 1 spans in 13.740798ms
Got the first Exp:
2016-07-06 17:49:34,381 ERROR [AsyncHBase I/O Worker #1] QueryExecutor: Query exception:
java.lang.NullPointerException: null
at net.opentsdb.tsd.QueryExecutor$1QueriesCB.call(QueryExecutor.java:327) ~[tsdb-2.3.0-RC1.jar:85047ea]
at net.opentsdb.tsd.QueryExecutor$1QueriesCB.call(QueryExecutor.java:251) ~[tsdb-2.3.0-RC1.jar:85047ea]
at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) [async-1.4.0.jar:na]
then:
2016-07-06 17:49:34,382 WARN [AsyncHBase I/O Worker #1] HttpQuery: [id: 0xe287bfdf, /0:0:0:0:0:0:0:1:53604 => /0:0:0:0:0:0:0:1:4242] Bad Request on /api/query/exp: null
2016-07-06 17:49:34,383 ERROR [AsyncHBase I/O Worker #1] QueryExecutor: Exception thrown during exception handling
java.lang.NullPointerException: null
at net.opentsdb.tsd.HttpSerializer.formatErrorV1(HttpSerializer.java:834) ~[tsdb-2.3.0-RC1.jar:85047ea]
at net.opentsdb.tsd.HttpQuery.badRequest(HttpQuery.java:413) ~[tsdb-2.3.0-RC1.jar:85047ea]
at net.opentsdb.tsd.QueryExecutor$ErrorCB.call(QueryExecutor.java:517) [tsdb-2.3.0-RC1.jar:85047ea]
then:
2016-07-06 17:49:34,384 ERROR [AsyncHBase I/O Worker #1] QueryExecutor: Query exception:
java.lang.NullPointerException: null
at net.opentsdb.tsd.QueryExecutor$ErrorCB.call(QueryExecutor.java:524) ~[tsdb-2.3.0-RC1.jar:85047ea]
at net.opentsdb.tsd.QueryExecutor$ErrorCB.call(QueryExecutor.java:489) ~[tsdb-2.3.0-RC1.jar:85047ea]
at com.stumbleupon.async.Deferred.doCall(Deferred.java:1278) [async-1.4.0.jar:na]
at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1257) [async-1.4.0.jar:na]
at com.stumbleupon.async.Deferred.callback(Deferred.java:1005) [async-1.4.0.jar:na]
then:
2016-07-06 17:49:34,385 WARN [AsyncHBase I/O Worker #1] HttpQuery: [id: 0xe287bfdf, /0:0:0:0:0:0:0:1:53604 => /0:0:0:0:0:0:0:1:4242] Bad Request on /api/query/exp: null
2016-07-06 17:49:34,386 ERROR [AsyncHBase I/O Worker #1] QueryExecutor: Exception thrown during exception handling
java.lang.NullPointerException: null
at net.opentsdb.tsd.HttpSerializer.formatErrorV1(HttpSerializer.java:834) ~[tsdb-2.3.0-RC1.jar:85047ea]
at net.opentsdb.tsd.HttpQuery.badRequest(HttpQuery.java:413) ~[tsdb-2.3.0-RC1.jar:85047ea]
at net.opentsdb.tsd.QueryExecutor$ErrorCB.call(QueryExecutor.java:517) [tsdb-2.3.0-RC1.jar:85047ea]
Using
{ "id": "e12","expr": "m1+m2" }
Works.
The expressions length is 4, which it is, but when it build compile_stack, the last entry is null
2016-07-06 20:41:00,677 DEBUG [AsyncHBase I/O Worker #1] QueryExecutor: Expressions Size is 4
2016-07-06 20:41:00,677 DEBUG [AsyncHBase I/O Worker #1] QueryExecutor: Adding expression to compile_stack[0]
2016-07-06 20:41:00,678 DEBUG [AsyncHBase I/O Worker #1] QueryExecutor: Adding expression to compile_stack[1]
2016-07-06 20:41:00,678 DEBUG [AsyncHBase I/O Worker #1] QueryExecutor: Adding expression to compile_stack[2]
2016-07-06 20:41:00,693 DEBUG [AsyncHBase I/O Worker #1] QueryExecutor: compile stack length: 4
2016-07-06 20:41:00,693 DEBUG [AsyncHBase I/O Worker #1] QueryExecutor: this x is: 3
2016-07-06 20:41:00,693 ERROR [AsyncHBase I/O Worker #1] QueryExecutor: compile_stack[3] is null
2016-07-06 20:41:47,847 DEBUG [AsyncHBase I/O Worker #1] QueryExecutor: this x is: 2
2016-07-06 20:41:47,847 DEBUG [AsyncHBase I/O Worker #1] QueryExecutor: Looking for variable m3
So, the bug is right here, it's not properly iterating the expressions.
https://github.com/OpenTSDB/opentsdb/blob/next/src/tsd/QueryExecutor.java#L321
// compile all of the expressions
final long intersect_start = DateTime.currentTimeMillis();
if (graph != null) {
final ExpressionIterator[] compile_stack =
new ExpressionIterator[expressions.size()];
final TopologicalOrderIterator<String, DefaultEdge> it =
new TopologicalOrderIterator<String, DefaultEdge>(graph);
int i = 0;
while (it.hasNext()) {
compile_stack[i++] = expressions.get(it.next());
}
for (int x = compile_stack.length - 1; x >= 0; x--) {
// look for and add expressions
for (final String var : compile_stack[x].getVariableNames()) {
ExpressionIterator source = expressions.get(var);
if (source != null) {
compile_stack[x].addResults(var, source.getCopy());
LOG.debug("Adding expression " + source.getId() + " to " +
compile_stack[x].getId());
}
}
compile_stack[x].compile();
LOG.debug("Successfully compiled " + compile_stack[x]);
}
} else {
for (final ExpressionIterator ei : expressions.values()) {
ei.compile();
LOG.debug("Successfully compiled " + ei);
}
}
LOG.debug("Finished compilations in " +
(DateTime.currentTimeMillis() - intersect_start) + " ms");
return serialize().addCallback(new CompleteCB()).addErrback(new ErrorCB());
}
}
Well, at least now I have captured there condition and can return a sensible error + 500 status code.
{
"error": {
"code": 500,
"message": " Internal Error: Less expressions where added to the compile stack than expressions.size (3 instead of 4)",
"trace": "java.io.IOException: Internal Error: Less expressions where added to the compile stack than expressions.size (3 instead of 4)
You fix looked good and I had some tweaks that I'm upstreaming now in that branch. Will have this one resolved shortly I hope.
Merged up into the "next" branch. @katameru could you give this a try now please?
This "works on my machine" now also
This is happening on 2.4.0, should we reopen or create a new ticket?
Yeah I'll re-open. @gnydick do you have an example please?
I found that it's when a filter is used, sometimes the
On the latest version of opentsdb (next/HEAD, currently 1409b550ae7f7fcf9c6cbe7e0dbf59c7363618ad), this query causes an exception during execution, and another exception during exception handling so no message is returned.
{
}