mhahsler / arules

Mining Association Rules and Frequent Itemsets with R
http://mhahsler.github.io/arules
GNU General Public License v3.0

kappa and leastContradiction #29

Closed: mhahsler closed this issue 7 years ago

mhahsler commented 7 years ago

My name is Feng. I am using your arules package, version 1.5.2, to find related products in my retail data. I am confused about two measures in the function interestMeasure: kappa and leastContradiction.

The package manual includes a piece of code that explains how to use interestMeasure. I changed the code a little:

    library(arules)

    data("Income")
    rules <- apriori(Income)

    # attach both measures to the rule quality slot
    quality(rules)$kappa <- interestMeasure(rules, measure = "kappa", transactions = Income)
    quality(rules)$leastContradiction <- interestMeasure(rules, measure = "leastContradiction", transactions = Income)

    # convert the rules to a data frame for inspection
    try <- as(rules, "data.frame")

Then we can see that the ranges of these two measures are:

    summary(try$leastContradiction)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.08794 0.13920 0.17000 0.18930 0.22170 0.90460

    summary(try$kappa)
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
    -43160000 -20510000 -19140000 -17660000 -12220000  -8042000

As you can see, the range of kappa is very different from the one the manual describes: [-1, 1].
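
For reference, here is a minimal sketch of why kappa is expected to stay within [-1, 1]. It uses the commonly cited definition of Cohen's kappa for a rule X => Y, computed from raw counts in base R. The function name `kappa_rule` and its count arguments are hypothetical and chosen for illustration; this is not the arules implementation, and the formula is assumed to match the one the manual refers to.

```r
# Sketch only: kappa for a rule X => Y from raw counts (not arules code).
# n_xy: transactions containing both X and Y
# n_x, n_y: transactions containing X, respectively Y
# n: total number of transactions
kappa_rule <- function(n_xy, n_x, n_y, n) {
  p_xy <- n_xy / n
  p_x <- n_x / n
  p_y <- n_y / n
  # probability that a transaction contains neither X nor Y
  p_neither <- 1 - p_x - p_y + p_xy
  # agreement expected if X and Y were independent
  expected <- p_x * p_y + (1 - p_x) * (1 - p_y)
  # note the parentheses around numerator and denominator
  (p_xy + p_neither - expected) / (1 - expected)
}

kappa_rule(50, 50, 50, 100)  # X and Y always occur together
kappa_rule(0, 50, 50, 100)   # X and Y never occur together
kappa_rule(30, 40, 50, 100)  # partial agreement
```

Under this definition the first call gives the upper bound and the second the lower bound, so values in the millions cannot come from the formula itself.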

When I use these two measures on my own data, I get:

    summary(myData1$kappa)
              Min.        1st Qu.         Median           Mean        3rd Qu.           Max.
    -5767000000000 -5765000000000 -5756000000000 -5745000000000 -5728000000000 -5610000000000

    summary(myData1$leastContradiction)
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
    -218.9000   -5.4530   -2.0120   -4.9540   -1.1050    0.8824

Could you please explain to me how to use these two measures? Thanks a lot.

Feng

mhahsler commented 7 years ago

This was indeed a bug. Resolution: added missing parentheses in the kappa calculation and fixed the equation for least contradiction. The fix is now available in the development version on GitHub and will be part of the next release (arules 1.2-3).
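
As a way to cross-check results by hand, the following base R sketch computes least contradiction from raw counts. It assumes the commonly cited definition (supp(X ∪ Y) - supp(X ∪ ¬Y)) / supp(Y); whether the fixed arules code uses exactly this form is an assumption. The function name `least_contradiction` and its arguments are hypothetical.

```r
# Sketch only: least contradiction for a rule X => Y (not arules code).
# n_xy: transactions containing both X and Y
# n_x, n_y: transactions containing X, respectively Y
# n: total number of transactions
least_contradiction <- function(n_xy, n_x, n_y, n) {
  supp_xy <- n_xy / n              # examples supporting the rule
  supp_x_not_y <- (n_x - n_xy) / n # counterexamples: X without Y
  supp_y <- n_y / n
  (supp_xy - supp_x_not_y) / supp_y
}

least_contradiction(30, 40, 50, 100)  # more examples than counterexamples
least_contradiction(50, 50, 50, 100)  # no counterexamples
```

Because supp(X ∪ Y) can never exceed supp(Y), this definition cannot produce a value above 1. A value above 1 in quality(rules)$leastContradiction would therefore disagree with the definition assumed here.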