mhahsler / arules

Mining Association Rules and Frequent Itemsets with R
http://mhahsler.github.io/arules
GNU General Public License v3.0

Prediction method for apriori #10

Closed by sjain777 8 years ago

sjain777 commented 8 years ago

Hi, is there a "predict" method for apriori similar to predict.rpart, etc.? Currently I have my own code, built following the suggestion at: http://stats.stackexchange.com/questions/21340/finding-suitable-rules-for-new-data-using-arules

`basket <- Groceries[2]
# find all rules, where the lhs is a subset of the current basket
rulesMatchLHS <- is.subset(rules@lhs,basket)
# and the rhs is NOT a subset of the current basket (so that some items are left as potential recommendation)
suitableRules <- rulesMatchLHS & !(is.subset(rules@rhs,basket))
# here they are
inspect(rules[suitableRules])
# now extract the matching rhs ...
recommendations <- strsplit(LIST(rules[suitableRules]@rhs)[[1]],split=" ")
recommendations <- lapply(recommendations,function(x){paste(x,collapse=" ")})
recommendations <- as.character(recommendations)
# ... and remove all items which are already in the basket
recommendations <- recommendations[!sapply(recommendations,function(x){basket %in% x})]
print(recommendations)`

But that takes enormously long (several hours) to process test data of 20,000 rows with an apriori model of about 300,000 rules. I wanted to check whether a method already exists that processes a table of test data much faster (ideally in a few seconds), or whether there is a plan to develop such a method in the near future.
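For readers with the same problem, one way to avoid looping over baskets is to match all rules against all new transactions in a single `is.subset()` call with `sparse = TRUE`. The following is only a minimal sketch under assumptions: the rule thresholds, the `newdata` sample, and the variable names are illustrative choices and not part of the original post, and no timing is claimed for a 300,000-rule model.

```r
library(arules)
data("Groceries")

# Illustrative model and "new" data (assumed thresholds, not from the post)
rules <- apriori(Groceries,
                 parameter = list(support = 0.01, confidence = 0.3),
                 control = list(verbose = FALSE))
newdata <- Groceries[1:100]

# Sparse rules x transactions matrices: TRUE where the rule side is
# contained in the transaction
lhs_in <- is.subset(lhs(rules), newdata, sparse = TRUE)
rhs_in <- is.subset(rhs(rules), newdata, sparse = TRUE)

# (rule, transaction) index pairs for each match
lhs_hits <- which(lhs_in, arr.ind = TRUE)
rhs_hits <- which(rhs_in, arr.ind = TRUE)

# Keep pairs whose LHS matches but whose RHS is not already in the basket
pair_key <- function(m) paste(m[, 1], m[, 2])
keep <- !(pair_key(lhs_hits) %in% pair_key(rhs_hits))

# Collect the distinct RHS labels per transaction index
rhs_labels <- labels(rhs(rules))
recs <- split(rhs_labels[lhs_hits[keep, 1]], lhs_hits[keep, 2])
recs <- lapply(recs, unique)

head(recs, 3)
```

The names of `recs` are positions in `newdata`. Transactions with no applicable rule do not appear in the list. With a very large rule set, the match matrices can still be big, so processing `newdata` in chunks may be necessary.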

Thanks! Supriya

mhahsler commented 8 years ago

Give package recommenderlab a try...
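As a rough illustration of this suggestion, the sketch below uses the association-rule recommender in recommenderlab (`method = "AR"`), which exposes a `predict()` method. The train/test split, the thresholds, and `n = 3` are assumptions chosen for the example, not values from this thread.

```r
library(recommenderlab)
data("Groceries", package = "arules")

# Convert the transactions to a binaryRatingMatrix via a logical matrix
ratings <- as(as(Groceries, "matrix"), "binaryRatingMatrix")

# Assumed split: first 9000 baskets for training, next 100 as new data
train <- ratings[1:9000, ]
test  <- ratings[9001:9100, ]

# Association-rule based recommender (thresholds are illustrative)
rec <- Recommender(train, method = "AR",
                   parameter = list(support = 0.01, confidence = 0.3))

# Up to 3 recommended items per new basket
pred <- predict(rec, test, n = 3)
pred_list <- as(pred, "list")

head(pred_list, 3)
```

Baskets for which no rule applies get an empty recommendation vector.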

sjain777 commented 8 years ago

ok, thanks for your feedback.