Closed markwagy closed 7 years ago
Hi @mwagyuvm
Can you please submit a pull request?
Hi, why does the programme not work correctly with my data?
Hi @bixiang
Can you please elaborate?
This is a news dataset. I can't obtain any rules with your programme.
------------------ Original Message ------------------ From: "asaini" notifications@github.com; Sent: Tuesday, July 14, 2015, 8:33 PM; To: "asaini/Apriori" Apriori@noreply.github.com; Cc: "晴雨如天" 631885006@qq.com; Subject: Re: [Apriori] Code does not implement APriori correctly (#5)
Hi @bixiang
Can you please elaborate?
Hello! Have you seen the dataset I sent you? Please tell me what the reason is.
I've read the code and, as far as I understand, it is an actual implementation based on this scientific publication: Agrawal, Rakesh, and Ramakrishnan Srikant. "Fast algorithms for mining association rules." Proc. 20th int. conf. very large data bases, VLDB. Vol. 1215. 1994. LINK
@bixiang Sometimes, due to the nature of the dataset, association rules may not provide useful information. Anyway, if you think that there is something wrong, you can always implement your own version or open a pull request.
Thanks @sn1p3r46 for the comment.
Closing this issue. No response from OP.
Hi. I don't think that this code implements the APriori algorithm correctly.
I say this because of the subset comparison in the "returnItemsWithMinSupport()" function. The code compares each item with a given transaction row using a subset comparison, rather than a comparison that preserves the column identity of the item.
For example, suppose we have the following transactions with column headers A and B:
A,B
a,b
b,a
The support for the rule {A=a} is 50% since there is one transaction in which A=a. But the way that the code is implemented, we get a support of 100% because A=a and B=a are both (incorrectly) deemed support for either {A=a} or {B=a}.
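The difference can be shown with a minimal sketch. This is not the repository's code: the function names, the `header`/`rows` variables, and the way items are encoded are all assumptions made for illustration. The first function tests bare values with a subset comparison, so column identity is lost. The second tags each value with its column before comparing.

```python
# Illustrative sketch only; names and data layout are hypothetical,
# not taken from the repository.

header = ["A", "B"]
rows = [["a", "b"], ["b", "a"]]  # the two transactions from the example


def support_ignoring_columns(value, rows):
    """Support of a bare value using a plain subset test on each row."""
    transactions = [frozenset(row) for row in rows]
    item = frozenset([value])
    hits = sum(1 for t in transactions if item.issubset(t))
    return hits / len(transactions)


def support_with_columns(column, value, header, rows):
    """Support of (column, value), so A=a and B=a stay distinct items."""
    transactions = [frozenset(zip(header, row)) for row in rows]
    item = frozenset([(column, value)])
    hits = sum(1 for t in transactions if item.issubset(t))
    return hits / len(transactions)


print(support_ignoring_columns("a", rows))           # 1.0
print(support_with_columns("A", "a", header, rows))  # 0.5
```

Under these assumptions, one way to get column-aware support without changing the mining code would be to prefix each value with its column name (for example `A=a`) when reading the input file.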