Closed eric-bzb closed 1 year ago
Hi, thanks! So glad to hear it!
I think maybe it would be a joint probability distribution, Distribution[(A, B)]
or maybe even something like A => Distribution[B]
. Maybe an example would help though?
Sorry for the delay in reply. I think I may not be explaining what I am after properly. I will try again.
Let's say we have two events, event A
and event B
but we do not know if the two events are independent. However, what we do know is the following data: Pr(A)
, Pr(B)
, Pr(A | B)
(which are each just Double
values between 0 and 1), and what we want to find is the joint probability Pr(A & B)
.
By a simple application of the definition of conditional probability, we can solve this by: Pr(A & B) = P(A | B)*Pr(B)
Yet, for some reason I am struggling mapping the above algorithm over to your probability-monad. Perhaps it is that the primitive data structure in your library is Distribution[T]
and an "event" such as event A
is really something like:
val probabilityOfEventA: Double = discreteUniform(List(1,2,3,4,5)).pr(_ < 5)
probabilityOfEventA
above has type Double
but perhaps the correct thing to do in order to implement the above (trivial) algorithm generically is to recast it into a Distribution[Boolean]
val eventA: Distribution[Boolean] = biasedCoin(probabilityOfEventA)
Sorry if this is not making much sense.
OK, I think I see what you mean. Fundamentally, Pr(A | B)
is a function distAgivenB: B => Distribution[A]
, in other words, something that given a particular outcome from B
, knows the distribution of A
s that result from that choice from B
. Then to get a joint distribution, all you would need is a distB: Distribution[B]
, and then do:
distB.flatMap(b => distAgivenB(b).map(a => (a, b))
or equivalently (and much more readable IMHO)
for {
b <- distB
a <- distAgivenB(b)
} yield (a, b)
This maps to your example insofar as Pr(A)
is not needed.
More concretely:
val isRaining: Distribution[Boolean] = tf(0.2)
def isTraffic(isRaining: Boolean): Distribution[Boolean] = if (isRaining) tf(0.8) else tf(0.3)
val joint = for {
r <- isRaining
t <- isTraffic(r)
} yield (r, t)
scala> joint.hist
(false,false) 56.05% ████████████████████████████████████████████████████████
(false,true) 23.73% ███████████████████████▋
(true,false) 4.00% ████
(true,true) 16.22% ████████████████▏
The probability monad is kind of an inside-out version of the way you're probably used to dealing with probability. You're only really dealing with individual samples (that is, the variable a
is a particular choice from the set of possible events A
). For me it's a more concrete way of thinking about it, but it definitely takes getting used to.
Thanks! Yes, I think we are talking about exactly the same thing now. I even had written some code that is very similar to your own:
/**
* Calculate the joint distribution, taking conditional probability
* as primitive (useful if it is not yet known whether the events are independent)
*
* @param distB
* @param aGivenB
* @return
*/
def joint[B](distB: Distribution[B])(aGivenB: B => Distribution[A]): Distribution[(A,B)] = for {
b <- distB
a <- aGivenB(b)
} yield (a,b)
I guess I was simply not believing my own code! :-) Again, thanks for providing the helpful hints and example.
Nice! Glad to help, any time.
This issue can be closed, yes?
I have really enjoyed playing with your probability-monad in scala. Thank you for coding it up! One issue I keep running into is a situation where the data that I am starting from is given in the form of conditional probabilities.
Is a conditional probability distribution
Pr(B | A)
essentially just represented as a functionDistribution[A] => Distribution[B]
?