Qalculate / libqalculate

Qalculate! library and CLI
https://qalculate.github.io/
GNU General Public License v2.0

Adding entropy (information theory) #284

Open wiz21b opened 3 years ago

wiz21b commented 3 years ago

Hello,

Could you, if it makes sense, add an entropy function to libqalculate, and add "Information theory" to the list of function types? Right now I define the function myself, but I think it is fundamental enough to be added to the core of the library. The only question is which unit to return: bits or shannons.

wiz21b commented 3 years ago

Please note that entropy is defined either for a real number (a probability) or a vector (list of probabilities).

wiz21b commented 3 years ago

Right now I use this function: if( sum(element(\x, x),1,dimension(\x)) = 0, 0, \x * log2(\x)). But it's not satisfactory because it must take a vector as a parameter. That prevents me from writing H(0.5), for example. Therefore, I tried:

if( isNumber(\x),
   if( x = 0, 0, \x*log2(\x)),
   if( sum(element(\x, x),1,dimension(\x)) = 0, 0, \x * log2(\x)))

but this doesn't work as it doesn't evaluate the 'if' inside the 'if'...

wiz21b commented 3 years ago

Ah, I had to use sub functions to declare the inner if's. Now it works. So I have : if( isNumber(\x), \1, \2) with \1 = if( \x = 0, 0, \x*log2(\x)), and \2 = if( sum(element(\x, x),1,dimension(\x)) = 0, 0, \x * log2(\x))

I'm leaving my previous comments so you can see my first idea for the function and how I fixed it. Please note that I didn't find anything about sub functions in the user manual...

wiz21b commented 3 years ago

Now I have found a small reference to sub functions in the documentation, but it's far from clear :-) I checked the API too, but it was too hard to grasp.

hanna-kn commented 3 years ago

You missed the backslash on the second x. The following works as expected.

if( isNumber(\x),
   if( \x = 0, 0, \x*log2(\x)),
   if( sum(element(\x, x),1,dimension(\x)) = 0, 0, \x * log2(\x)))

Note that \x * log2(\x), when \x is a vector, is ambiguous and will not work in the upcoming version (use dot(\x, log2(\x)) to retain the behaviour of the current version).

Here is an alternative version: if([total(\x)=0, isNumber(\x)], [0, \x*log2(\x)], dot(\x, log2(\x)))

Even better, if you specify the parameter as a vector (you will not need to use brackets for the vector, and a single value will work as expected): if(total(\x)=0, 0, dot(\x, log2(\x)))
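For readers who want to check values outside Qalculate, here is a minimal Python sketch of the same idea: a single number is treated as a one-element vector, and the result is 0 when the entries sum to 0. The names as_vector, sum_p_log2_p and shannon_entropy are hypothetical and not part of libqalculate. Skipping zero entries (the usual 0 * log2(0) = 0 convention) is an assumption of this sketch, not something stated in this thread. Note also that the expressions above compute the sum of p * log2(p); the conventional Shannon entropy in bits is the negative of that sum, which is what shannon_entropy returns.

```python
import math


def as_vector(x):
    """Treat a single number as a one-element vector (hypothetical helper)."""
    if isinstance(x, (int, float)):
        return [float(x)]
    return [float(v) for v in x]


def sum_p_log2_p(x):
    """Rough analogue of if(total(x)=0, 0, dot(x, log2(x))).

    Assumption: zero entries are skipped, i.e. 0 * log2(0) is taken as 0.
    """
    p = as_vector(x)
    if sum(p) == 0:
        return 0.0
    return sum(v * math.log2(v) for v in p if v > 0)


def shannon_entropy(x):
    """Conventional entropy in bits: the negative of the sum above."""
    return -sum_p_log2_p(x)


print(shannon_entropy([0.5, 0.5]))  # fair coin, as a vector
print(shannon_entropy(0.5))         # a single probability
```

With these definitions, shannon_entropy([0.5, 0.5]) gives 1.0 and shannon_entropy(0.5) gives 0.5, the contribution of a single outcome with probability 0.5.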

wiz21b commented 3 years ago

Ooooh! Thanks for all the answers! I tried your suggestion to use the "dot" function, but it doesn't work in qalc 3.18, downloaded today from your website (see the session below).

> info dot
No function, variable, unit, or prefix with specified name exist.

> dot( vector(1,2), vector(3,4) )
error: "o" is not a valid variable/function/unit.
day × (0 × tonne) × [[1, 2], [3, 4]] = [[0 t·d, 0 t·d], [0 t·d, 0 t·d]]

(I understand dot() is the dot product; i.e. element-wise multiplication).
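A note on terminology: the dot product is usually defined as the sum of the element-wise products, which is a single number, whereas element-wise multiplication returns a vector. The Python sketch below illustrates the difference on the vectors from the session above. The function names are hypothetical, and this makes no claim about how libqalculate implements dot().

```python
def elementwise_product(a, b):
    """Multiply matching entries; the result is a vector of the same length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x * y for x, y in zip(a, b)]


def dot_product(a, b):
    """Sum of the element-wise products; the result is a single number."""
    return sum(elementwise_product(a, b))


a = [1, 2]
b = [3, 4]
print(elementwise_product(a, b))  # [3, 8]
print(dot_product(a, b))          # 11
```

For an entropy function, the summed form is the one that matters, since the result should be one number rather than a vector.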

hanna-kn commented 3 years ago

Sorry, dot() will only be available in version 3.19 (tomorrow). For older versions, replace it with the multiplication operator (a bit unfortunate).