Open maggienj opened 7 years ago
err msg is... "type":"illegal_argument_exception","reason":"Extraneous conditional statement.
one other msg... "type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards": ( this err msg may not be applicable, because we do see shards available and the es cluster is up and running... still, not sure why this msg would pop up )
caused by
caused by
ERROR: Internal Server Error: {"error":{"root_cause":[{"type":"script_exception","reason":"compile error","script_stack":["(!((false)?(doc[\"a\"].values. ..."," ^---- HERE"],"script":"(!((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\")))) ? (1) : (0)","lang":"painless"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"testing_000_n20170713_200436","node":"OX5gNnXQRtW6gThYBtxJ4w","reason":{"type":"script_exception","reason":"compile error","caused_by":{"type":"illegal_argument_exception","reason":"Extraneous conditional statement."},"script_stack":["(!((false)?(doc[\"a\"].values. ..."," ^---- HERE"],"script":"(!((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\")))) ? (1) : (0)","lang":"painless"}}],"caused_by":{"type":"script_exception","reason":"compile error","caused_by":{"type":"illegal_argument_exception","reason":"Extraneous conditional statement."},"script_stack":["(!((false)?(doc[\"a\"].values. ..."," ^---- HERE"],"script":"(!((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\")))) ? (1) : (0)","lang":"painless"}},"status":500}
File "C:\Users\user\PycharmProjects\ActiveData\pyLibrary\env\elasticsearch.py", line 755, in post
File "C:\Users\user\PycharmProjects\ActiveData\pyLibrary\env\elasticsearch.py", line 755, in post
File "C:\Users\user\PycharmProjects\ActiveData\pyLibrary\env\elasticsearch.py", line 1090, in search
File "C:\Users\user\PycharmProjects\ActiveData\jx_elasticsearch\es09\util.py", line 40, in post
File "C:\Users\user\PycharmProjects\ActiveData\jx_elasticsearch\es14\setop.py", line 194, in extract_rows
File "C:\Users\user\PycharmProjects\ActiveData\jx_elasticsearch\es14\setop.py", line 64, in es_setop
File "C:\Users\user\PycharmProjects\ActiveData\jx_elasticsearch\jx_usingES.py", line 160, in query
File "C:\Users\user\PycharmProjects\ActiveData\jx_python\jx.py", line 71, in run
File "C:\Users\user\PycharmProjects\ActiveData\active_data\actions\jx.py", line 62, in jx_query
File "C:\Users\user\PycharmProjects\ActiveData\active_data\__init__.py", line 54, in output
this is how the current generated query looks like.
trying to figure these out...... in the first sight, does it look correct? how should the corrected query look like...
ERROR: Problem with search (path=/testing_000_n/test_result/_search):
{
"from": 0,
"query": {"bool": {"filter": {"match_all": {}}}},
"script_fields": {
"is_c": {"script": "((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"c\"))) ? (1) : (0)"},
"is_e": {"script": "((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\"))) ? (1) : (0)"},
"not_e": {"script": "(!((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\")))) ? (1) : (0)"}
},
"size": 10,
"stored_fields": ["a"]
}
also, another generated query in the same log...
ERROR: Problem with call to http://localhost:9200/testing_000_x/test_result/_search
{"query": {"bool": {"filter": {"match_all": {}}}}, "stored_fields": ["a"], "from": 0, "script_fields": {"not_e": {"script": "(!((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\")))) ? (1) : (0)"}, "is_e": {"script": "((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\"))) ? (1) : (0)"}, "is_c": {"script": "((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"c\"))) ? (1) : (0)"}}, "size": 10}
"script_stack":[
"(!((false)?(doc["a"].values. ...",
" ^---- HERE"],"script":"(!((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\")))) ? (1) : (0)","lang":"painless"}}],"caused_by":{"type":"script_exception","reason":"compile error","caused_by":{"type":"illegal_argument_exception","reason":"Extraneous conditional statement."},"script_stack":["(!((false)?(doc[\"a\"].values. ..."," ^---- HERE"],"script":"(!((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\")))) ? (1) : (0)","lang":"painless"}},"status":500}
"(!((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\")))) ? (1) : (0)"}
" ^---- HERE"]
(no errs in the below line.... it shows err in the above line only.., the above line has a notOperator (!) ....
may need to check the data types and data type matching in either sides of the ternary operator....
because, datatypes should match in either sides for ternary operator as per ES5. )
false is boolean datatype....so it is ok... contains function returns a boolean .... so, it is ok too... 1 and 0 are integers... but it should be ok... because in certain context we do use 0 or 1 to represent true or false.
((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\"))) ? (1) : (0)"},
prpbably it is just the not "!".... because it didnt raise errs in other lines.... so , it may not be related to boolean... so , may not need to touch anything other than the "!" not operator... will check es5 / painless syntax for "!" (not operator)...
well, it doesn't seem to be related to "not operator " or !.
search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"testing_000_h20170713_213247","node":"OX5gNnXQRtW6gThYBtxJ4w","reason":{"type":"script_exception","reason":"compile error","caused_by":{"type":"illegal_argument_exception","reason":"Extraneous conditional statement."},"script_stack":["(!((false)?File "C:\Users\user\PycharmProjects\ActiveData\pyLibrary\env\elasticsearch.py", line 755, in post
what does (doc[\"a\"].values.size()==0) ? (1) : (0)
give?
how about (false) ? (1) : (0)
?
It may be complaining about the (false)
does contains
return a boolean?
"not_e": {
"script": "(!((false)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"e\")))) ? (1) : (0)"
}
all datatypes should be same in this line.... but.... not sure, why size==0 exists here.....
Logically, if the value is not equal to "e" then set the scripted_field, "not_e" to 1...... otherwise , set it to "0"
if we have to check for size > 0 for field's value, so that we don't get stackoverflow or comparison with null values, then, there should be another nested ternary operator..... so i'm trying to simplify this to understand... to fix this...
pseudo_code for inner condition: (checking for greater than 0 , instead of == 0 as that is how the loop flows in this version of pseudo code... )
if (doc[\"a\"].values.size() > 0) then
if ((doc[\"a\"].values).contains(\"e\")) then "not_e" = 1
else:
"not_e" = 0
else
"not_e" = 0
if we have to rewrite the above with ternary operator , then ...
(doc[\"a\"].values.size() > 0) ?
( doc[\"a\"].values).contains(\"e\") ? 1 : 0 )
: 0
so, if we look at this there are two "?"s and two ":"s.
will try to run a similar modded one... in es-head....
ESv1.4 allowed doc[\"a\"].value
to pull the value, but it only pulled the first value. doc[\"a\"].values
pulls a an array of values; an empty list if there are no values. size()
is a function that returns the length of an array. doc[\"a\"].values.size()==0
is the same as doc[\"a\"].empty
.
if false:
temp = doc["a"].values.size()==0
else:
temp = doc["a"].values.contains("e")
if not temp:
return 1
else:
return 0
So, modded this as per the above pseudo-code and ran it in es-head. the below code snippet works fine ....
** we are supposed to make as minimal changes as possible... but i jut tried it as per my understanding to figure out what code would pass in es-head for this to work logically with es5 painless syntax
"script_fields": {
"not_e": {
"script": " ( (doc[\"a\"].values.size()>0) ? ( doc[\"a\"].values.contains(\"e\") ? 1 : 0) : 0 )"
}
}
just saw the questions posted....
here are the answers for them...
(doc[\"a\"].values.size()==0) ? (1) : (0)
returns 0
(false) ? (1) : (0)
as expected, (false) raises err...
( in one of the previous tests, we removed (false) or (true) in the condition... i guess.... as it raised some errs... that's why i kinda removed the (false) in the above queries... )
yes, contains
returns a boolean.
I tried this, to see if I can get the class name
{
"from": 0,
"query": {
"bool": {
"filter": {
"match_all": {}
}
}
},
"script_fields": {
"is_c": {
"script": "doc[\"a\"].getClass().getName()"
}
},
"size": 10,
"stored_fields": [
"a"
]
}
that did not work, but I did get the class name:
"Unable to find dynamic method [getClass] with [0] arguments for class [org.elasticsearch.index.fielddata.ScriptDocValues.Strings]."
which I googled to get here
follow the parent classes to java.util.AbstractList
; https://docs.oracle.com/javase/7/docs/api/java/util/AbstractList.html and to the parent https://docs.oracle.com/javase/7/docs/api/java/util/Collection.html
to verify that org.elasticsearch.index.fielddata.ScriptDocValues.Strings
is a Collection
which has a contains()
method that returns boolean
OK, searched for "elasticsearch "Extraneous conditional statement."" to get here https://github.com/elastic/elasticsearch/blob/master/modules/lang-painless/src/main/java/org/elasticsearch/painless/node/EConditional.java#L67 which means Elasticsearch is being smart and looking for constants; and complaining if they exist.
This is a terrible thing. We must perform constant propagation and dead code elimination to get around this.
why do we need this (false) in this code-snippet here?
if false:
temp = doc["a"].values.size()==0
else:
temp = doc["a"].values.contains("e")
if not temp:
return 1
else:
return 0
I think that (false) is extraneous and that's why the err is... that;s why avoided that (false) in the previous tries....
So, without the (false) the logic would look like this..
pseudo_code-1
if (doc[\"a\"].values.size() > 0) then
if ((doc[\"a\"].values).contains(\"e\")) then "not_e" = 1
else:
"not_e" = 0
else
"not_e" = 0
@extend(WhenOp)
def to_painless(self, not_null=False, boolean=False, many=False):
return "(" + self.when.to_painless(boolean=True) + ") ? (" + self.then.to_painless(not_null=not_null) + ") : (" + self.els_.to_painless(not_null=not_null) + ")"
will have to check if self.when.to_painless(boolean=True) is the one that is resulting in (false) and may be remove this extraneous condition check...
The code looks dumb to a human, but the machine is just writing out expressions. Consider
{"eq": {"a": "c"}}
which is the same as
EqOp("eq", [Variable("a"), Literal("c")])
when running this code:
return "(" + rhs_missing + ")?(" + lhs_list + ".size()==0):((" + lhs_list + ").contains(" + rhs + "))"
the rhs is Literal("c")
, so Literal("c").missing()
is always false
, which is put as the first parameter of the conditional. The rhs_missing
could also be True
or None
, so we have to check those cases too.
(1==2)?(doc[\"a\"].values.size()==0):((doc[\"a\"].values).contains(\"c\"))) ? (1) : (0)
Has the same problem: Elasticsearch knows that 1
is a constant, 2
is a constant, therefore 1==2
is a constant, so throw an error. Our code must be smarter than the Elasticsearch code if we are going to trick it into running our painless scripts. Our code must perform the comparision and simplify the code based on the conditional.
this script runs fine for "is_c" and "is_e"....
but raises err for "not_e" ...
{
"from": 0,
"query": {
"bool": {
"filter": {
"match_all": {}
}
}
},
"script_fields": {
"is_c": {
"script": "(doc[\"a\"].values.size()==0 ? ( (doc[\"a\"].values).contains(\"c\" ) ? (1) : (0 ) ):0)"
},
"is_e": {
"script": "(doc[\"a\"].values.size()==0 ? ( (doc[\"a\"].values).contains(\"e\" ) ? (1) : (0 ) ):0)"
}
},
"size": 10,
"stored_fields": [
"a"
]
}
"not_e" raises err.. because, need to figure out which adds the notOperator "!" and then balance the parentheses for notOperator... ( it balances the param for the above scripted fields and they run fine... but not the below one... )
"not_e": {
"script": "(!(doc[\"a\"].values.size()==0 ? ( (doc[\"a\"].values).contains(\"e\" ) ) ? (1) : (0 ) ):0)"
}
We will now trick painless to do what we want: Replace all instances of false
with System.currentTimeMillis() < 0
and true
with System.currentTimeMillis()>0
.
I added partial evaluation code
if boolean:
return WhenOp(
"when",
self.rhs.missing(),
**{"then": self.lhs.missing(), "else": InOp("in", [self.rhs, self.lhs])}
).partial_eval().to_painless(boolean=True)
else:
return WhenOp(
"when",
OrOp("or", [self.rhs.missing(), self.lhs.missing()]),
**{"then": NullOp(), "else": InOp("in", [self.rhs, self.lhs])}
).partial_eval().to_painless(boolean=True)
All expressions now have a partial_eval
method to simplify literal expressions. This is not completed for all expressions, but it is done enough to pass the test. All the non-trivial to_painless
code must construct an expression and run partial_eval
. String concatenation will no longer work in all cases.
def partial_eval(self):
when = self.when.partial_eval()
if isinstance(when, Literal):
if isinstance(when, TrueOp):
return self.then.partial_eval()
elif isinstance(when, (FalseOp, NullOp)):
return self.els_.partial_eval()
else:
Log.error("Expecting `when` clause to return a Boolean, or `null`")
else:
self.simplified = True
return self
The common patterns to notice here: Each expression will look for Literal
parameters that will help simplify the logic; there is always the case that the expression can not be constant-folded, so we return self; we set the simplified
parameter so we do not go into logic loops.
fix unit test test_jx.test_expressions_w_set_ops.TestSetOps.test_select_when_on_multivalue