RokIvansek / attribute-interactions

A Python implementation of attribute interactions. An Orange add-on (scripting part only).

Look into the new entropy function, try to optimize it #8

Closed by RokIvansek 8 years ago

RokIvansek commented 8 years ago

http://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array

http://stackoverflow.com/questions/26998223/what-is-the-difference-between-contiguous-and-non-contiguous-arrays
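The two links above point at the unique-rows idea: the joint entropy of several attributes can be estimated from the counts of distinct rows. Below is a minimal sketch of that idea, not the code from this repository. The function name `joint_entropy` is hypothetical, and it assumes a NumPy version that supports `np.unique(..., axis=0)`, used here in place of the contiguous void-view trick discussed in the first link.

```python
import numpy as np


def joint_entropy(X):
    """Shannon entropy (in bits) of the empirical distribution of the rows of X.

    Hypothetical helper for illustration; X is a 2-D array of discrete values,
    one row per sample and one column per attribute.
    """
    # A contiguous copy keeps row-wise operations cheap (see the second link).
    X = np.ascontiguousarray(X)
    # Count how often each distinct row (attribute-value combination) occurs.
    _, counts = np.unique(X, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


if __name__ == "__main__":
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    print(joint_entropy(X))
```

Counting rows once and working on the count vector avoids a Python-level loop over samples, which is usually where a naive entropy implementation spends its time.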

RokIvansek commented 8 years ago

Done with commit 4936be25a9b4bd01494120e91df94a4920c2ae83

RokIvansek commented 8 years ago

If it turns out to be too slow, you could look into Orange.statistics.contingency, which is optimized for speed. However, it does not apply additive smoothing, and I am not sure whether it works for sparse tables.
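Since additive smoothing is the missing piece, it could be applied on top of any raw contingency table. The sketch below is an assumption about how that might look, not the add-on's implementation: the function name `smoothed_joint_probs`, its parameters, and the default `alpha=1.0` are all hypothetical, and the counts are built with plain NumPy rather than with Orange.

```python
import numpy as np


def smoothed_joint_probs(x, y, n_x, n_y, alpha=1.0):
    """Joint probabilities of two discrete attributes with additive smoothing.

    Hypothetical helper: x and y are integer-coded value arrays of equal
    length, n_x and n_y are the numbers of possible values, and alpha is
    the pseudo-count added to every cell.
    """
    counts = np.zeros((n_x, n_y))
    # Unbuffered add, so repeated (x, y) pairs are all counted.
    np.add.at(counts, (x, y), 1)
    # Add alpha to each cell and renormalise so the table sums to 1.
    return (counts + alpha) / (counts.sum() + alpha * n_x * n_y)


if __name__ == "__main__":
    x = np.array([0, 0, 1])
    y = np.array([0, 1, 1])
    print(smoothed_joint_probs(x, y, n_x=2, n_y=2))
```

The same final line would work on a count table obtained from another source, as long as it is a dense array of non-negative counts.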

RokIvansek commented 8 years ago

Orange.statistics.contingency handles sparse data. Speed measurements are:

Testing for 1000000 samples, not sparse:
time my contingency: 0.4261052516667405
time Orange contingency: 0.019851794000108686

Testing for 1000000 samples, sparse:
time my contingency: 0.4150877369999459
time Orange contingency: 0.051676561333200276