mhahsler / arules

Mining Association Rules and Frequent Itemsets with R
http://mhahsler.github.io/arules
GNU General Public License v3.0

is.redundant: Error: cannot allocate vector of size 250.0 Gb #4

Closed sjain777 closed 8 years ago

sjain777 commented 8 years ago

Hi, my arulesModel contains 260K rules and has a size of 17 MB. Applying the is.redundant method to it hits the memory limit on a machine with 16 GB RAM:

is.redundant(arulesModel)

Error: cannot allocate vector of size 250.0 Gb
In addition: Warning messages:
1: In .local(x, y, proper, sparse, ...) :
  Reached total allocation of 16274Mb: see help(memory.size)
2: In .local(x, y, proper, sparse, ...) :
  Reached total allocation of 16274Mb: see help(memory.size)
3: In .local(x, y, proper, sparse, ...) :
  Reached total allocation of 16274Mb: see help(memory.size)
4: In .local(x, y, proper, sparse, ...) :
  Reached total allocation of 16274Mb: see help(memory.size)

memory.limit()
[1] 16274 # => 16 GB

How can I remove redundant rules without hitting the memory limit? Thanks!

mhahsler commented 8 years ago

The issue is that the is.redundant function compares every rule with every other rule and therefore needs O(n^2) space, which does not fit into memory in your case. A new implementation of is.redundant that should handle this better will be released in the next version of arules.
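
To make the O(n^2) figure concrete, here is a rough back-of-the-envelope estimate in R. It assumes the pairwise comparison is held as a dense n x n matrix with 4 bytes per cell (the size R uses for logical and integer values); the thread does not state the actual internal layout, so treat the variable names and the cell size as illustrative assumptions only.

```r
# Rough estimate only: assumes a dense n x n comparison matrix, 4 bytes per cell.
n_rules        <- 260000   # "260K rules" from the report (a rounded figure)
bytes_per_cell <- 4        # assumption: size of one R logical/integer value

n_cells  <- n_rules^2                         # one cell per (rule, rule) pair
size_gib <- n_cells * bytes_per_cell / 1024^3 # R reports "Gb" in units of 1024^3 bytes

round(size_gib, 1)  # about 251.8, in line with the 250.0 Gb in the error message
```

Under these assumptions the pairwise structure is roughly 15,000 times larger than the 17 MB rule set itself, which is why the allocation fails regardless of how much of the 16 GB is free.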

sjain777 commented 8 years ago

Thanks much for your feedback! I look forward to getting the new release.