Closed GoogleCodeExporter closed 9 years ago
Thanks for reporting the bug. Can you show me the original query as put into
the query bar in the web interface? The patch you've supplied would change the
matching logic from a boolean OR to an AND, and I need to be sure that's what
should happen here.
Original comment by mchol...@gmail.com
on 18 Jun 2013 at 2:30
timeout:0 +host=127.0.0.1 program="smartdefense" program="waf_wf" class=FW_IPS
class=WAF -WAF.method=OPTIONS -WAF.method=PROPFIND -WAF.attack=COOKIE_TAMPERED
-"Lan7.128" -"IPS Bypass" -FW_IPS.srcip=10.20.5.114 -WAF.srcip=192.174.55.105
-FW_IPS.dstip=192.174.58.21 -FW_IPS.attack=ipfragments
-FW_IPS.attack=tcp_invalid_checksum_enable -FW_IPS.attack=tcp_segment_limit
-FW_IPS.attack=tcp_block_retrans_err_enable
-FW_IPS.attack=tcp_block_urg_bit_enable -FW_IPS.attack=streamingengine
-FW_IPS.message_info="DNS data is too long" -WAF.message_info="url-references"
-WAF.message_info="webtop-error" -WAF.message_info="403"
-WAF.site="autodiscover.xml" -WAF.site="javascript"
-WAF.site="afdcredesign.afdcstage.nrel.gov" -WAF.rule="Deny-Internal"
With the patch, it generates this MATCH:
MATCH(' ((@host 2130706433)) !("IPS
Bypass"|"Lan7.128"|(10.20.5.114|169084274)|(192.174.58.21|3232643605)) ( !((@s0
COOKIE_TAMPERED)|(@s1 "Deny\\-Internal")|(@s3 "url\\-references")|(@s3
"webtop\\-error")|(@s3 403)|(@s4 OPTIONS)|(@s4 PROPFIND)|(@s5
"afdcredesign.afdcstage.nrel.gov")|(@s5 "autodiscover.xml")|(@s5
"javascript"))) ( !((@s1 ipfragments)|(@s1 streamingengine)|(@s1
tcp_block_retrans_err_enable)|(@s1 tcp_block_urg_bit_enable)|(@s1
tcp_invalid_checksum_enable)|(@s1 tcp_segment_limit)|(@s2 "DNS data is too
long")))')
Without the patch it generates this MATCH, which sphinx throws an error on:
MATCH(' ((@host 2130706433)) !("IPS
Bypass"|"Lan7.128"|(10.20.5.114|169084274)|(192.174.58.21|3232643605)) ( !((@s0
COOKIE_TAMPERED)|(@s1 "Deny\\-Internal")|(@s3 "url\\-references")|(@s3
"webtop\\-error")|(@s3 403)|(@s4 OPTIONS)|(@s4 PROPFIND)|(@s5
"afdcredesign.afdcstage.nrel.gov")|(@s5 "autodiscover.xml")|(@s5
"javascript"))| !((@s1 ipfragments)|(@s1 streamingengine)|(@s1
tcp_block_retrans_err_enable)|(@s1 tcp_block_urg_bit_enable)|(@s1
tcp_invalid_checksum_enable)|(@s1 tcp_segment_limit)|(@s2 "DNS data is too
long")))')
Original comment by and...@wolftrail.com
on 18 Jun 2013 at 7:18
I noticed another bug, it is losing the WAF.srcip from the resulting MATCH for
some reason, I'll look into that and see if I can fix it.
Original comment by and...@wolftrail.com
on 18 Jun 2013 at 7:26
Per comment #3 - If the srcip term comes after other field matches for the
specified class then the IPs are not added to the MATCH, this is the correct
behavior, the IPs should not be in the MATCH in this case.
On the other hand, if like in my example, you have an a term like srcip before
any other fields for the specified class, then the IP address is added to the
MATCH incorrectly.
The issue is line 1282 in Query.pm:
elsif ($term_hash->{op} !~ /[\<\>]/ and not exists
$self->terms->{field_terms}->{$boolean}->{$class_id}){
The exists test fails if the parser hasn't hit any field_terms for the class
already. It might make sense for the parser to move attribute terms to the end
of a query string so that they are parsed correctly regardless of how the user
types them in.
Original comment by and...@wolftrail.com
on 18 Jun 2013 at 8:06
I've committed the fix for the reported issue, but I'm still unclear on the
second bug. Can you provide the shortest possible example query with the IP
that is incorrectly added to the match string, and how it is incorrect?
Original comment by mchol...@gmail.com
on 21 Jun 2013 at 6:33
I added the following to line 1295 of Query.pm to help troubleshoot this.
$self->log->trace('Adding IP to any_field_terms: ' . $term_hash->{value});
So here is a short example query:
class=A class=B -A.srcip=192.168.1.1 -A.type=alert -B.user=luser
-B.dstip=10.0.0.2
In this example I want logs that are from class A or B, except for class A logs
from 192.168.1.1 or class A logs with type of alert or class B logs with a user
of luser or a destination of 10.0.0.2. Because of the bug I would instead get
logs from class A or class B except for any logs with a the string 192.168.1.1
anywhere in the message or class A logs with type of alert or class B logs with
a user of luser or a destination of 10.0.0.2.
If I rewrite the query like this then it works correctly:
class=A class=B -A.type=alert -B.user=luser -B.dstip=10.0.0.2
-A.srcip=192.168.1.1
However, this still leaves the problem of when you only search for an attribute
field for a class, like the following query:
class=A class=B -A.type=alert -B.dstip=10.0.0.2 -A.srcip=192.168.1.1
Can you explain the purpose of adding 10.0.0.2 to any_field_terms in this
instance? It seems to needlessly exclude potential results. For example a log
entry of class A with a source IP of 10.0.0.2 would be excluded in the last
example by the MATCH generated by the code, which is not what someone writing
the query would expect.
I think if you just comment out line 1295 then everything would work correctly.
I haven't been able to come up with a scenario where the search breaks with
that line commented out.
Original comment by and...@wolftrail.com
on 21 Jun 2013 at 8:26
Ok, yes I'm seeing the issue now. The problem is that integer fields aren't
currently indexed like string fields are. This is an optimization to keep
indexes smaller because searching the message will definitely find the string,
since an integer is just one word. The attribute filter ensures that only the
right matches come back. However, you've shown an edge case which only negated
values don't work correctly. This used to simply come back with an error
saying there were no positive values in the query, but now that class-only
queries can be run normally (they use SQL instead of Sphinx), this has come to
light. Hopefully that explains why the field term is put into the
any_field_terms matching clause. I'm working on a fix for the ordering problem.
Original comment by mchol...@gmail.com
on 24 Jun 2013 at 3:06
Ah, that makes sense. The weird bugs always pop up when the underlying
assumptions change. Thanks for the feedback, let me know if there's anything I
can help with.
Original comment by and...@wolftrail.com
on 24 Jun 2013 at 3:47
This is fixed in rev 944
Original comment by mchol...@gmail.com
on 1 Jul 2013 at 3:02
Original issue reported on code.google.com by
and...@wolftrail.com
on 17 Jun 2013 at 9:32