Bushra-Aljbawi opened this issue 4 years ago
In case it helps someone: saving the DataFrame of itemsets to a pickle file rather than a CSV file solved the problem for me.
Glad you were able to solve the problem. Yeah, CSV files are not really able to store Python objects like frozensets.
```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori

dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

te = TransactionEncoder()
te_ary = te.fit(dataset).transform(dataset)
df = pd.DataFrame(te_ary, columns=te.columns_)

frequent_itemsets = apriori(df, min_support=0.6, use_colnames=True)
frequent_itemsets.to_csv('my.csv', index=None)
frequent_itemsets.head()

# Reading the CSV back turns the frozensets into plain strings
df = pd.read_csv('my.csv')
df.head()
```
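To make the difference concrete, here is a minimal sketch of the two round trips (the toy DataFrame and filenames are illustrative, not from this thread):

```python
import pandas as pd

# Toy stand-in for the frequent_itemsets DataFrame
fi = pd.DataFrame({
    'support': [0.8, 0.6],
    'itemsets': [frozenset({'Eggs'}), frozenset({'Eggs', 'Kidney Beans'})],
})

# CSV round trip: the frozensets come back as plain strings
fi.to_csv('itemsets.csv', index=False)
csv_back = pd.read_csv('itemsets.csv')
print(type(csv_back['itemsets'][0]))   # <class 'str'>

# Pickle round trip: the frozenset objects are preserved
fi.to_pickle('itemsets.pkl')
pkl_back = pd.read_pickle('itemsets.pkl')
print(type(pkl_back['itemsets'][0]))   # <class 'frozenset'>
```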
However, if you would like a more CSV-friendly representation, you could write a custom function to help with this. Something along the lines of:

```python
def frozenset_to_str(x):
    x = list(x)
    x = str(x).lstrip('[').rstrip(']').strip()
    return x

frequent_itemsets['itemsets'] = frequent_itemsets['itemsets'].apply(frozenset_to_str)
frequent_itemsets
```
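And if you later need to turn those strings back into frozensets (e.g. after `pd.read_csv`), a rough inverse could look like the sketch below; note it assumes item names contain no commas:

```python
def str_to_frozenset(s):
    # Inverse of frozenset_to_str above: split on commas,
    # then strip whitespace and the surrounding quotes
    return frozenset(item.strip().strip("'\"") for item in s.split(','))

str_to_frozenset("'Eggs', 'Kidney Beans'")
# frozenset({'Eggs', 'Kidney Beans'})
```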
I can reopen this issue to remind myself to add something to the documentation to clarify it / provide an example
Hi, I'm also experiencing the same error:
> You are likely getting this error because the DataFrame is missing antecedent and/or consequent information. You can try using the `support_only=True` option
I'm using mlxtend 0.23.0 and Python 3.8.13.
I'm only experiencing this error when running `association_rules` after using the `fpmax` algorithm to generate the frequent_itemsets. I don't experience this error when running `association_rules` after using `fpgrowth`, `hmine`, or `apriori` to generate the frequent_itemsets. All else is equal (i.e., frequent_itemsets are generated using a min_support of 0.5 and a max_len of 2; association_rules parameters are metric="lift" and min_threshold=1).
Like the OP, I understand that it's because the code can't find the support of one of the antecedent/consequent items. In the thread https://github.com/rasbt/mlxtend/issues/390 , I saw @rasbt post:

> The support of at least one of the two is 0.253623, but the support for the other item might be higher.

That logic makes sense to me, but what doesn't make sense is how the item could be missing from frequent_itemsets if its support is at least min_support. Would appreciate any help. Let me know if I am missing something or have a misunderstanding.
Thanks for bringing that up. Unfortunately, I currently don't have the capacity to dive into the code and see what's going on (due to other projects and deadlines) but this is worth investigating.
Experiencing the same issue with fpmax even when working with the documentation dataset.
To reproduce the error:

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, fpmax, fpgrowth, association_rules

dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

te = TransactionEncoder()
te_ary = te.fit(dataset).transform(dataset)
df = pd.DataFrame(te_ary, columns=te.columns_)

frequent_itemsets = fpmax(df, min_support=0.1, use_colnames=True)
frequent_itemsets
```

returns this output

```python
association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)
```

returns the below error
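A plausible explanation (my reading, not confirmed by the maintainers): `fpmax` returns only *maximal* frequent itemsets, so the subsets that `association_rules` needs as antecedents/consequents are deliberately absent from the support table, even though each of them individually meets min_support. A toy illustration of the failing lookup (not mlxtend internals):

```python
# Support table as fpmax would leave it: only the maximal itemset survives;
# its frequent subsets (e.g. {'Eggs'}) are pruned by design.
support = {frozenset({'Eggs', 'Kidney Beans', 'Onion'}): 0.6}

rule_itemset = frozenset({'Eggs', 'Kidney Beans', 'Onion'})
antecedent = frozenset({'Eggs'})

try:
    # confidence = support(rule) / support(antecedent)
    confidence = support[rule_itemset] / support[antecedent]
except KeyError:
    print("KeyError: the antecedent's support is missing from the table")
```

This would also explain why `support_only=True` is the suggested workaround: it skips computing the metrics that need those subset supports.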
cc @rasbt, @chenrocky any updates on this error raising with fpmax? Thanks!
Hi, thanks a lot for the amazing repository.
I'm using this to generate association rules. Extracting the itemsets of different lengths is not a problem. Also, extracting the itemsets and using them directly to generate rules works. However, after saving the itemsets to a file, generating the rules after reading the saved file produces this error:
```
KeyError: "frozenset({'1', 'z', 'f', 'a', \"'\", ')', 'l', '(', 'o', 'k', 'r', ' ', 'b', '%', 'e', 's', 'm', '}', '{', 'i', 'u', 't'}) You are likely getting this error because the DataFrame is missing antecedent and/or consequent information. You can try using the `support_only=True` option"
```

I understand that it's because the code can't find the support of one of the antecedent/consequent items, but I have no idea how to solve it. I've read all the possible solutions in this thread: https://github.com/rasbt/mlxtend/issues/390 but none of them works. I also tried saving the itemsets to a CSV file separated by `;` rather than `,` to avoid special-character problems, but the error persists.
I'm using Python 3.7 and mlxtend 0.17.2.

Would appreciate any idea. Thanks, Bushra.
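One observation on the garbled KeyError itself: a frozenset of single characters is exactly what `frozenset()` produces when applied to a string, which suggests the stringified itemset read back from the CSV was passed to `frozenset()` somewhere along the way. A quick illustration (the string below is hypothetical):

```python
# Applying frozenset() to a string iterates over its characters,
# yielding a set of single characters rather than of item names
s = "frozenset({'Milk'})"
print(frozenset(s))
```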