druid-io / pydruid

A Python connector for Druid
Other
506 stars 194 forks source link

Filters extraction function #76

Closed gaetano-guerriero closed 6 years ago

gaetano-guerriero commented 7 years ago

This adds support for filters of type extraction, using an extraction function.

Extraction functions are also supported with filters of compatible type (this works only with druid >= 0.9.1)

magor commented 6 years ago

Where can I vote for this to be merged? :)

magor commented 6 years ago

bump; looks like it's ready, should I wait for this (any ETA?) or use my own patched version?

dyangelo-grullon commented 6 years ago

anyone home?

gianm commented 6 years ago

I think none of the druid core committers are actively using pydruid so that explains why it hasn't been committed yet. Is anyone here using this patch locally and/or has confirmed that it passes unit tests? If so I can commit it.

Erbureth commented 6 years ago

Hi,

we have tested it and it looks OK - at least it doesn't break any passing tests in PyDruid test suite nor our own.

One small remark - we had to create our own extraction class so we could use the lookups in filters:

class RegisteredLookupExtraction(LookupExtraction):

    extraction_type = 'registeredLookup'

    def __init__(self, lookup, **kwargs):
        super(RegisteredLookupExtraction, self).__init__(**kwargs)
        self._lookup = lookup

    def build_lookup(self):
        return self._lookup
gaetano-guerriero commented 6 years ago

I can confirm this is working right here.

@Erbureth I cannot understand, something like this is working for me:

topn(
            ...
            filter=Filter(
                 extraction_function=MapLookupExtraction({'iPhone': 'N9'}),
                 type='in', values=['N9'], dimension='device',
                 .....
             ....
)
Erbureth commented 6 years ago

@gaetano-guerriero Simple map lookups are insufficient for us, we have big dictionaries fed from Kafka topics, each registered under specific name and running with corresponding tasks on middle managers, brokers and historical nodes. This helper class is perfectly usable for our scenario and working well with your code.

mistercrunch commented 6 years ago

@gaetano-guerriero I'm taking over maintaining this library. Mind rebasing this PR?

gaetano-guerriero commented 6 years ago

Hi, this is now rebased.

ghulands commented 6 years ago

@gaetano-guerriero Looks like you need to fix some code formatting.

gaetano-guerriero commented 6 years ago

I guess this was failing in master too ? Indeed #115 did same fixes. Anyway I added a commit for flake8, in case you want to merge this PR first.

mistercrunch commented 6 years ago

Still merge conflicts, mind rebasing again?

gaetano-guerriero commented 6 years ago

rebased