piskvorky / bounter

Efficient Counter that uses a limited (bounded) amount of memory regardless of data size.
MIT License

__contains__ doesn't work with python3 #32

Closed jayantj closed 5 years ago

jayantj commented 6 years ago

Code to Reproduce

from bounter import bounter

lines = [
    ['some', 'sentence'],
    ['another', 'sentence'],
    ['here', 'have', 'some', 'more'],
    ['you', 'should', 'stop', 'now'],
]

counter = bounter(size_mb=1024, need_iteration=False, log_counting=1024)

for line in lines:
    bigrams = zip(line, line[1:])
    counter.update(line)
    counter.update(' '.join(pair) for pair in bigrams)

print(counter['some'])
>>> 2

print('some' in counter)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-f89596bc8ae8> in <module>()
----> 1 print('some' in counter)

/home/jayant/miniconda3/envs/pii_tools_env/lib/python3.5/site-packages/bounter/count_min_sketch.py in __getitem__(self, key)
    127 
    128     def __getitem__(self, key):
--> 129         return self.cms.get(key)
    130 
    131     def cardinality(self):

TypeError: The parameter must be a unicode object or bytes buffer!

print(b'some' in counter)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-071c2b90c294> in <module>()
----> 1 print(b'some' in counter)

/home/jayant/miniconda3/envs/pii_tools_env/lib/python3.5/site-packages/bounter/count_min_sketch.py in __getitem__(self, key)
    127 
    128     def __getitem__(self, key):
--> 129         return self.cms.get(key)
    130 
    131     def cardinality(self):

TypeError: The parameter must be a unicode object or bytes buffer!

Is __contains__ supported? I haven't tried this with Python 2.

piskvorky commented 6 years ago

Even if it isn't supported—and I think it should be—that error message isn't right. Marking as a bug; thanks for reporting.
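A likely reason the message looks wrong, sketched with a stand-in class rather than bounter itself: when a class defines `__getitem__` but neither `__contains__` nor `__iter__`, Python evaluates `x in obj` by calling `obj[0]`, `obj[1]`, and so on. The key that reaches `__getitem__` is then the integer `0`, not the string being tested, which would explain why both `'some'` and `b'some'` fail the same way. `StrictGetter` below is hypothetical and only mimics the type check seen in the traceback.

```python
class StrictGetter:
    """Hypothetical stand-in: accepts only str/bytes keys and defines
    __getitem__ but neither __contains__ nor __iter__."""

    def __init__(self):
        self.seen_keys = []  # records every key passed to __getitem__

    def __getitem__(self, key):
        self.seen_keys.append(key)
        if not isinstance(key, (str, bytes)):
            raise TypeError("The parameter must be a unicode object or bytes buffer!")
        return 0


obj = StrictGetter()
error_message = None
try:
    'some' in obj  # falls back to the sequence protocol: obj[0], obj[1], ...
except TypeError as err:
    error_message = str(err)

print(obj.seen_keys)   # the keys __getitem__ actually received
print(error_message)
```

The recorded keys show that the membership test never passes `'some'` through; the `TypeError` is raised for the integer index that Python supplies.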

menshikh-iv commented 6 years ago

Reproduced (on py2 too). This is definitely a bug: __contains__ should be supported. Thank you @jayantj!

aneesh-joshi commented 5 years ago

Fixed(?) in https://github.com/RaRe-Technologies/bounter/pull/38
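For readers wondering what supporting `in` could look like, here is a minimal sketch that is not necessarily what the linked pull request does. It uses a hypothetical dict-backed class, `SketchLike`, in place of bounter's counter, and assumes that membership can reasonably be defined as "the stored count is greater than zero". On a real count-min sketch the counts are approximate, so such a test would be approximate too.

```python
from collections import Counter


class SketchLike:
    """Hypothetical dict-backed stand-in for a string counter (not bounter's code)."""

    def __init__(self):
        self._counts = Counter()

    def update(self, iterable):
        self._counts.update(iterable)

    def __getitem__(self, key):
        if not isinstance(key, (str, bytes)):
            raise TypeError("The parameter must be a unicode object or bytes buffer!")
        return self._counts[key]  # Counter returns 0 for unseen keys

    def __contains__(self, key):
        # Assumed semantics: a key is "in" the counter if its count is positive.
        # Defining this stops Python from falling back to integer indexing.
        return self[key] > 0


counter = SketchLike()
counter.update(['some', 'sentence'])

print('some' in counter)
print('missing' in counter)
```

With an explicit `__contains__`, the key is passed through unchanged, so `'some' in counter` no longer reaches `__getitem__` with an integer.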