rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

fpmax gives wrong support #690

Closed connesy closed 4 years ago

connesy commented 4 years ago

MWE:

>>> import pandas as pd
>>> import mlxtend
>>> pd.__version__
'1.0.2'
>>> mlxtend.__version__
'0.17.2'
>>> from mlxtend.frequent_patterns import fpmax
>>> df = pd.DataFrame([[1,1,0],[1,0,1],[0,0,1]], columns=['a','b','c'])
>>> print(df)
   a  b  c
0  1  1  0
1  1  0  1
2  0  0  1
>>> print(fpmax(df, min_support=0.01, use_colnames=True))
    support itemsets
0  0.333333   (b, a)
1  0.666667   (c, a)

rasbt commented 4 years ago

Yes, thanks, I think it should be 0.333333 for (a, c) as well. Let me CC @harenbergsd, who implemented the algorithm.
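
For reference, the expected value can be checked by hand without mlxtend. The snippet below is only a minimal sketch: it re-creates the MWE data as plain Python lists, and the `support` helper is a hypothetical name for illustration, not part of the mlxtend API.

```python
# Same data as the MWE above, as plain lists (no pandas or mlxtend needed).
columns = ['a', 'b', 'c']
rows = [[1, 1, 0], [1, 0, 1], [0, 0, 1]]


def support(itemset):
    """Fraction of rows in which every item of `itemset` is present.

    Hypothetical helper for illustration only.
    """
    idx = [columns.index(item) for item in itemset]
    hits = sum(all(row[i] for i in idx) for row in rows)
    return hits / len(rows)


print(support({'a', 'b'}))  # only row 0 contains both a and b
print(support({'a', 'c'}))  # only row 1 contains both a and c
print(support({'c'}))       # rows 1 and 2 contain c
```

Only one of the three rows contains both a and c, so the support of (a, c) comes out as 1/3. The reported 0.666667 equals the support of c alone (and of a alone) on this data.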

connesy commented 4 years ago

I have tracked it down to commit 115278b. Before this commit (at commit ac0f0c1), the above example returns:

>>> fpmax(df, min_support=0.01, use_colnames=True)
    support itemsets
0  0.333333   (a, b)
1  0.333333   (a, c)
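
The output at ac0f0c1 agrees with a brute-force enumeration of the maximal frequent itemsets. The sketch below is not how fpmax works internally; it simply tries every column subset, which is only feasible for tiny inputs like this one. The `support` helper and the `frequent` / `maximal` names are assumptions made for illustration.

```python
from itertools import combinations

# Same data as the MWE, as plain lists.
columns = ['a', 'b', 'c']
rows = [[1, 1, 0], [1, 0, 1], [0, 0, 1]]
min_support = 0.01


def support(itemset):
    """Fraction of rows in which every item of `itemset` is present."""
    idx = [columns.index(item) for item in itemset]
    return sum(all(row[i] for i in idx) for row in rows) / len(rows)


# Every non-empty subset of columns whose support reaches min_support.
frequent = {
    frozenset(combo): support(combo)
    for k in range(1, len(columns) + 1)
    for combo in combinations(columns, k)
    if support(combo) >= min_support
}

# An itemset is maximal if no frequent itemset is a proper superset of it.
maximal = {
    items: sup
    for items, sup in frequent.items()
    if not any(items < other for other in frequent)
}

for items, sup in maximal.items():
    print(round(sup, 6), tuple(sorted(items)))
```

On this data the maximal itemsets are (a, b) and (a, c), each with support 1/3, which is what the pre-115278b version reports.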

harenbergsd commented 4 years ago

Thanks guys, I will check it out

rasbt commented 4 years ago

@connesy : Thanks to @harenbergsd , it should be fixed now :)

In order to use it, you can install the latest dev version via

pip install git+https://github.com/rasbt/mlxtend.git

(until the next version release)

connesy commented 4 years ago

Thanks!