Code does not implement APriori correctly

asaini / Apriori

Python Implementation of Apriori Algorithm for finding Frequent sets and Association Rules

MIT License

Code does not implement APriori correctly #5

Closed markwagy closed 7 years ago

markwagy commented 9 years ago

Hi. I don't think that this code implements the APriori algorithm correctly.

The problem is the subset comparison in the "returnItemsWithMinSupport()" function. The code tests each item against a transaction row with a plain subset check, which discards the column identity of the item instead of preserving it.

For example, suppose we have the following transactions with column headers A and B:

A,B
a,b
b,a

The support for the itemset {A=a} is 50%, since there is one transaction out of two in which A=a. But with the code as implemented, we get a support of 100%, because A=a and B=a are both (incorrectly) counted as support for either {A=a} or {B=a}.
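The difference described above can be illustrated with a minimal sketch. It does not reproduce the repository's actual code; the helper names `subset_support` and `column_aware_support` are hypothetical and exist only for this illustration. The first treats each row as an unordered set of values, the second tags each value with its column header before counting.

```python
def subset_support(itemset, rows):
    """Support when each row is an unordered set of values.

    Column identity is discarded, which is the behaviour described
    in the report (an assumption, not the repository's code).
    """
    transactions = [frozenset(row) for row in rows]
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def column_aware_support(itemset, rows, headers):
    """Support when every value is tagged with its column header,
    so that A=a and B=a are distinct items."""
    transactions = [frozenset(zip(headers, row)) for row in rows]
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


headers = ["A", "B"]
rows = [["a", "b"], ["b", "a"]]

naive = subset_support(frozenset({"a"}), rows)
tagged = column_aware_support(frozenset({("A", "a")}), rows, headers)

print(naive)   # 1.0: "a" appears somewhere in both rows
print(tagged)  # 0.5: A=a holds in only one of the two rows
```

Whether the untagged behaviour counts as a bug depends on the input: for market-basket data, where a row is simply a bag of items, the subset check is the usual definition of support. For tabular data with meaningful columns, one option is to prefix each value with its column name before passing the rows in.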

asaini commented 9 years ago

Hi @mwagyuvm

Can you please submit a pull request?

bixiang commented 9 years ago

Hi, why does the program not work correctly with my data?

asaini commented 9 years ago

Hi @bixiang

Can you please elaborate?

bixiang commented 9 years ago

This is a news dataset. I can't obtain any rules with your program.

bixiang commented 9 years ago

Hello! Have you seen the dataset I sent you? Please tell me what the reason is.

sn1p3r46 commented 8 years ago

I've read the code, and as far as I understand, it is an actual implementation based on this scientific publication: Agrawal, Rakesh, and Ramakrishnan Srikant. "Fast algorithms for mining association rules." Proc. 20th Int. Conf. Very Large Data Bases (VLDB), Vol. 1215, 1994.

@bixiang Sometimes, due to the nature of the dataset, association rules may not yield any useful information. Anyway, if you think there is something wrong, you can always implement your own version or open a pull request.
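As a rough illustration of how a dataset can yield no rules at all, here is a small sketch on made-up data. The function `frequent_items`, the transactions, and the thresholds are all hypothetical and are not taken from the repository or from the dataset discussed in this thread. When items rarely co-occur, nothing may reach the minimum support, so no frequent itemsets (and hence no rules) come out.

```python
from collections import Counter


def frequent_items(transactions, min_support):
    """Return single items whose support is at least min_support."""
    # Count each item at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    n = len(transactions)
    return {item: c / n for item, c in counts.items() if c / n >= min_support}


# Hypothetical sparse data: most keywords occur in only one transaction.
transactions = [
    ["election", "economy"],
    ["football", "transfer"],
    ["weather", "storm"],
    ["economy", "inflation"],
]

high = frequent_items(transactions, 0.75)
low = frequent_items(transactions, 0.5)

print(high)  # {}: no item appears in 75% of the transactions
print(low)   # {'economy': 0.5}
```

With data like this, lowering the minimum support (and minimum confidence) is usually the first thing to try before concluding that the implementation is at fault.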

asaini commented 7 years ago

Thanks @sn1p3r46 for the comment.

Closing this issue. No response from OP