mamaral closed this issue 7 years ago
Hi,
I reproduced your execution traces and results. The classifier state is:
```
{ classFeatures: { one: { '0': 3, '1': 2, '2': 3 }, two: { '3': 4, '4': 4, '5': 2 } },
  classTotals: { one: 3, two: 4 }, totalExamples: 6, smoothing: 1 }
```
Which gives us:
one: 3/3 × 2/3 × 3/3 × 1/3 × 1/3 × 1/3 ≈ 1 × 0.66 × 1 × 0.33 × 0.33 × 0.33 = 0.02371842
two: 1/4 × 1/4 × 1/4 × 4/4 × 4/4 × 2/4 = 0.25 × 0.25 × 0.25 × 1 × 1 × 0.5 = 0.0078125
The likelihood is higher for "one" here, but the prior favors "two", so the final numbers end up closer together:
prior("one") = 3/6 = 0.5
prior("two") = 4/6 ≈ 0.66
final "one" = 0.02371842 × 3/6 = 0.011859210000000002 (the trailing digits are just floating-point noise, not a real rounding error)
final "two" = 0.0078125 × 4/6 ≈ 0.005208333333333333
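For what it's worth, the scoring walked through above can be sketched in a few lines of JavaScript. This is a hypothetical re-implementation of the behavior described here, not apparatus's actual source; exact fractions (2/3 rather than the truncated 0.66) give ≈0.0123 for "one" rather than 0.0119, but the ordering is the same:

```javascript
// Sketch of the smoothed scoring described above (not the library's code).
// For each "on" feature, multiply in count/classTotal, substituting the
// smoothing constant when the feature was never seen for that class, then
// weight by the class prior classTotal/totalExamples.
const classifier = {
  classFeatures: {
    one: { 0: 3, 1: 2, 2: 3 },
    two: { 3: 4, 4: 4, 5: 2 }
  },
  classTotals: { one: 3, two: 4 },
  totalExamples: 6,
  smoothing: 1
};

function probabilityOfClass(cls, observation) {
  let prob = 1;
  observation.forEach((on, i) => {
    if (!on) return; // only "on" features contribute
    const count = classifier.classFeatures[cls][i] || 0;
    prob *= (count || classifier.smoothing) / classifier.classTotals[cls];
  });
  // weight by the class prior
  return prob * (classifier.classTotals[cls] / classifier.totalExamples);
}

console.log(probabilityOfClass('one', [1, 1, 1, 1, 1, 1])); // ≈ 0.0123
console.log(probabilityOfClass('two', [1, 1, 1, 1, 1, 1])); // ≈ 0.0052
```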
Now, I understand your thought process, but adding the extra training instance for "two" means the 1 in column 6 for "two" is worth 1/3 (1/4 smoothed), while the 1 in column 2 for "one" is worth 1/2 (1/3 smoothed).
Working without smoothing and ignoring the other columns, you then get:

one = 2/2 × 1/2 × 2/2 = 0.5
two = 3/3 × 3/3 × 1/3 ≈ 0.33
(one is more likely)
and then adding the priors:
0.5 × 0.4 = 0.2
0.33 × 0.66 = 0.2178
two wins by very little.
This is of course not correct, because without any smoothing both full products are zero! As soon as you add any smoothing, that little difference in favor of "two" disappears.
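To make that concrete, here is a tiny sketch (again hypothetical, not the library's code) showing that with smoothing set to 0 both full products collapse to zero, while any positive smoothing leaves "one" ahead:

```javascript
// Toy likelihood: multiply count/total over the "on" features, falling back
// to the smoothing constant for features never seen in the class.
function likelihood(counts, total, observation, smoothing) {
  let prob = 1;
  observation.forEach((on, i) => {
    if (!on) return;
    prob *= (counts[i] || smoothing) / total;
  });
  return prob;
}

const one = { 0: 3, 1: 2, 2: 3 };
const two = { 3: 4, 4: 4, 5: 2 };
const obs = [1, 1, 1, 1, 1, 1];

console.log(likelihood(one, 3, obs, 0)); // 0 — feature 3 was never seen for "one"
console.log(likelihood(two, 4, obs, 0)); // 0 — feature 0 was never seen for "two"
console.log(likelihood(one, 3, obs, 1) > likelihood(two, 4, obs, 1)); // true
```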
If you still feel this is an error, feel free to re-open this bug.
I haven't used or looked at this for a few years now, so it's all completely gone from my brain. I have a feeling I was mistaken about this case at the time anyway, but thank you for getting back to me. :)
I'm working on reverse-engineering the Bayes classifier algorithm to better understand how it works under the covers, and am seeing what appear to be inconsistencies between the results of the `probabilityOfClass` and `classify` functions. I have a hunch it may be related to https://github.com/NaturalNode/apparatus/issues/7, but am not sure. Here are some examples showing what I've been seeing.

The above code outputs `one`, as I would expect, with the following values:

The above code outputs `two`, as I would expect, with the following values:

The above code outputs `one`, which is _not_ what I would expect. The `probabilityOfClass` function assigns the following values for each class:

My expectation is that, given an array of "observations" where both classes are represented equally, those observations would be a better "match" to the class with more closely related examples? In other words, `[1,1,1,1,1,1]` has the same number of "perfect" matches in class `one` as in class `two`, but it also has more "partial" matches in class `two`, so why would it be a better "fit" for class `one`? Perhaps what we need is some sort of prior probability here?

Any clarification, especially if my understanding is flawed (which is likely), would be fantastic.
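The prior-probability idea raised at the end is exactly what the reply above works through. A toy sketch, using the unsmoothed numbers from that reply (likelihoods 0.5 vs 0.33, priors 0.4 vs 0.66; an illustration only, not the library's code), shows how a prior can flip the decision:

```javascript
// Illustrative numbers taken from the maintainer's reply above.
const likelihoods = { one: 0.5, two: 0.33 };
const priors = { one: 0.4, two: 0.66 };

// Return the key with the highest score.
function argmax(scores) {
  return Object.keys(scores).reduce((a, b) => (scores[a] >= scores[b] ? a : b));
}

// Likelihood alone prefers "one"…
console.log(argmax(likelihoods)); // "one"

// …but multiplying in the priors flips the decision to "two".
const posteriors = {};
for (const cls of Object.keys(likelihoods)) {
  posteriors[cls] = likelihoods[cls] * priors[cls];
}
console.log(argmax(posteriors)); // "two" (0.2 vs 0.2178)
```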