RokIvansek / attribute-interactions

A Python implementation of attribute interactions. An Orange add-on (scripting part only).

Speed up entropy function #15

Closed RokIvansek closed 8 years ago

RokIvansek commented 8 years ago

The entropy function currently calls np.unique to find the unique values in the input array. This information is already present in the domain of an Orange table, so instead of running np.unique it can be read from the data domain via data.domain.variables[i].values and data.domain.class_var.values. The method must then accept integer column indices (consecutive column numbers) instead of the actual arrays.
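To make the two approaches concrete, here is a minimal NumPy-only sketch, not the add-on's actual code. The function names `probs_unique`, `probs_known_values` and `entropy` are hypothetical. It assumes a discrete column is stored as float indices 0..k-1 with no missing values, and that `n_values` would come from something like `len(data.domain.variables[i].values)`.

```python
import numpy as np


def probs_unique(column):
    """Estimate value probabilities by scanning the column with np.unique."""
    _, counts = np.unique(column, return_counts=True)
    return counts / counts.sum()


def probs_known_values(column, n_values):
    """Estimate value probabilities when the number of distinct values is known.

    Assumption: the column holds indices 0..n_values-1 (stored as floats)
    and contains no missing values (NaN).
    """
    counts = np.bincount(column.astype(np.intp), minlength=n_values)
    return counts / counts.sum()


def entropy(probs):
    """Shannon entropy in bits; zero probabilities contribute nothing."""
    probs = probs[probs > 0]
    return -np.sum(probs * np.log2(probs))


# Toy column with three distinct values, standing in for one table column.
column = np.array([0.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 1.0])

p_scan = probs_unique(column)
p_domain = probs_known_values(column, n_values=3)
print(entropy(p_scan), entropy(p_domain))
```

One difference to keep in mind: the domain-based variant returns a zero probability for any value that is declared in the domain but absent from the data, whereas np.unique simply omits it. The entropy is the same either way because zero-probability entries are filtered out.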

RokIvansek commented 8 years ago

Turns out this does not yield a substantial speedup (roughly 1.4x in the timings below). Because it also makes the code more confusing, I will not include it.

getprobs is the method that uses the data.domain info about unique values, and get_probs is the method that calls np.unique.

Testing with 100000 samples, calculating probabilities for 3 attributes:
Time get_probs: 0.06428435299979658
Time getprobs: 0.046929360999759716

Testing with 1000000 samples:
Time get_probs: 0.691636645666828
Time getprobs: 0.4972618786669045
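A comparison of this kind could be reproduced roughly as follows. This is only a sketch under stated assumptions: it times a single synthetic column with five values rather than the three attributes used above, the helper `time_call` is hypothetical, and np.bincount stands in for the domain-based counting. Absolute numbers will vary by machine.

```python
import timeit

import numpy as np


def time_call(func, *args, repeat=3, number=1):
    """Return the best wall-clock time (in seconds) over `repeat` runs."""
    return min(timeit.repeat(lambda: func(*args), repeat=repeat, number=number))


# Synthetic discrete column: 100000 samples, values 0..4 stored as floats.
rng = np.random.default_rng(0)
column = rng.integers(0, 5, size=100_000).astype(float)

# Variant 1: discover the distinct values by scanning the data.
t_unique = time_call(lambda c: np.unique(c, return_counts=True), column)

# Variant 2: the number of distinct values (5) is known in advance.
t_bincount = time_call(
    lambda c: np.bincount(c.astype(np.intp), minlength=5), column
)

print("Time np.unique:  ", t_unique)
print("Time np.bincount:", t_bincount)
```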