druid-io / pydruid

A Python connector for Druid

Filter operands are modified during "and" and "or" operations #101

Open sajomathews opened 7 years ago

sajomathews commented 7 years ago

When filters are combined with the & and | operators, the operands can be modified in place, which might lead to unexpected results.

This only happens if the type of the operation is the same as the type of the first filter, for example when & is applied to a filter that is already an "and" filter.
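To make the condition concrete, here is a minimal sketch of the pattern that would produce this behavior. ToyFilter is a hypothetical stand-in written for illustration, not pydruid's actual class; the assumption is that the combining operator appends to the existing field list when the left operand already has the matching type.

```python
class ToyFilter:
    """Hypothetical stand-in for a filter object; not pydruid's actual class."""

    def __init__(self, type, fields=None, **spec):
        self.type = type
        self.fields = fields if fields is not None else []
        self.spec = spec

    def _combine(self, op, other):
        if self.type == op:
            # Assumed behavior: the left operand is extended in place and returned.
            self.fields.append(other)
            return self
        # Otherwise a fresh compound filter wraps both operands.
        return ToyFilter(op, fields=[self, other])

    def __and__(self, other):
        return self._combine("and", other)

    def __or__(self, other):
        return self._combine("or", other)


a = ToyFilter("selector", dimension="alpha", value=1)
b = ToyFilter("selector", dimension="bravo", value=200)
f = ToyFilter("in", dimension="foxtrot", values=[1, 2, 3])

a_and_b = a & b
count_initial = len(a_and_b.fields)

_ = a_and_b | f  # operation type differs from the left operand's type
count_after_or = len(a_and_b.fields)

_ = a_and_b & f  # operation type matches the left operand's type
count_after_and = len(a_and_b.fields)
```

In this sketch only the last operation changes a_and_b, because it is the only one where the operator matches the type of the filter on the left.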

thorbjornwolf commented 6 years ago

+1, I just got bitten by the same behavior in pydruid==0.4.2. Here is the smallest example I could come up with:

In [1]: from pydruid.utils.filters import Dimension
   ...: from pydruid.utils.filters import Filter
   ...: 
   ...: a = (Dimension('alpha') == 1)
   ...: b = (Dimension('bravo') == 200)
   ...: f = Filter(type='in', dimension='foxtrot', values=[1, 2, 3])
   ...: 
   ...: a_and_b = a & b
   ...: print('Filters in a_and_b:', len(a_and_b.filter['filter']['fields']))
   ...: 
   ...: 
Filters in a_and_b: 2

In [2]: _ = a_and_b & f
   ...: print('Filters in a_and_b:', len(a_and_b.filter['filter']['fields']))
   ...: 
   ...: 
Filters in a_and_b: 3

The seemingly innocent a_and_b & f modifies a_and_b, which, for me, is entirely unexpected.

thorbjornwolf commented 6 years ago

Well well well! A tiny investigation reveals that this is intended behavior, at least in the source code comments:

if self is already or, don't create a new filter but just append x to the filter fields.
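Until the behavior changes, one possible workaround is to combine against a copy of the left operand. The sketch below is an assumption-laden illustration: combine_copy and AppendingAnd are hypothetical names, AppendingAnd only imitates the append-in-place behavior described in the comment above, and whether copy.deepcopy is appropriate for real pydruid filters has not been verified here.

```python
import copy
import operator


def combine_copy(op, left, right):
    """Apply op to a deep copy of left so the original object is not touched."""
    return op(copy.deepcopy(left), right)


class AppendingAnd:
    """Hypothetical stand-in whose & appends in place and returns itself."""

    def __init__(self, fields):
        self.fields = list(fields)

    def __and__(self, other):
        self.fields.append(other)
        return self


original = AppendingAnd(["a", "b"])
combined = combine_copy(operator.and_, original, "f")
```

The cost is one extra copy per combination, in exchange for operands that keep their original contents.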

@dakra (thanks for your work!!), it looks like you wrote it back in the day :) Is the above the intended behavior, or is it a use-case that wasn't considered in the design?