aimalz / proclam

PRObabilistic CLAssification Metrics for PLAsTiCC

Paper Feedback Round 2: @DoctorLobster #64

Closed aimalz closed 5 years ago

aimalz commented 5 years ago

The notation in Equation 1 looks funny to me.

Usually when you say a random variable x is a draw from, say, a standard Gaussian, it is written as

x ~ N(0,1)

so the right-hand side is the name of a distribution.

In your Equation 1, the RHS is an algebraic expression, and it is not the pdf of a Dirichlet distribution. I think you're trying to write the random number generation algorithm on the RHS, but perhaps you've mixed up the Gamma distribution with the gamma function? (The latter appears in the normalising constant of the former.)

Also, if what is drawn jointly is a vector of random variables, the vector should be bold-faced, i.e. \mathbf{p}. I would just use the symbol Dir[ ], and I recommend something like:

Let \mathbf{p} be a vector of class probabilities, such that each element p_m \equiv p( m | d_n, D, C). Then we draw \mathbf{p} from a Dirichlet distribution on the probability simplex:

\mathbf{p} \sim \text{Dir}[ \mathbb{C}_{m'_n} \delta ]

The mean is \mathbb{E}[\mathbf{p}] = \mathbb{C}_{m'_n}, and the variance of the perturbations is set by \delta.
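
As a quick numerical check of the mean and variance statements, here is a minimal sketch using scipy.stats.dirichlet. The confusion-matrix row `c_row` and the two values of `delta` are made-up illustrative numbers, not taken from the paper or the proclam code, and the row is assumed to sum to one.

```python
import numpy as np
from scipy.stats import dirichlet

# Hypothetical confusion-matrix row for true class m'_n (assumed to sum to 1).
c_row = np.array([0.7, 0.2, 0.1])

def dirichlet_summary(c_row, delta):
    """Return the analytic mean and variance of Dir[c_row * delta]."""
    alpha = c_row * delta
    return dirichlet.mean(alpha), dirichlet.var(alpha)

# Same row, two illustrative values of delta.
mean_lo, var_lo = dirichlet_summary(c_row, delta=10.0)
mean_hi, var_hi = dirichlet_summary(c_row, delta=1000.0)
```

Under that normalisation assumption the mean equals `c_row` for any `delta`, while the per-class variance is c_m (1 - c_m) / (\delta + 1), so a larger \delta gives smaller perturbations.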

If necessary, you can write in a footnote that, algorithmically, you draw a probability vector from Dir[ \mathbf{\alpha} ] by drawing independently from a Gamma distribution for each m

q_m ~ Gamma( \alpha_m, 1 )

and then setting p_m = q_m / sum(q). I would use Gamma rather than \Gamma to describe a gamma random variable or density, to avoid confusing it with the related gamma function, which uses \Gamma.
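
For concreteness, the Gamma-normalisation recipe can be sketched with NumPy as follows. The function name `draw_dirichlet_via_gamma`, the example `alpha`, the sample size, and the seed are all hypothetical choices for illustration, not part of the paper or the proclam code.

```python
import numpy as np

def draw_dirichlet_via_gamma(alpha, size, rng):
    """Draw `size` probability vectors from Dir[alpha] by normalising Gamma draws."""
    alpha = np.asarray(alpha, dtype=float)
    # q_m ~ Gamma(alpha_m, 1), drawn independently for each class m.
    q = rng.gamma(shape=alpha, scale=1.0, size=(size, alpha.size))
    # p_m = q_m / sum(q), so each row lies on the probability simplex.
    return q / q.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)       # illustrative seed
alpha = np.array([7.0, 2.0, 1.0])    # illustrative concentration parameters
p = draw_dirichlet_via_gamma(alpha, size=20000, rng=rng)
```

Each row of `p` sums to one, and the sample mean over rows should sit close to `alpha / alpha.sum()`.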

aimalz commented 5 years ago

@DoctorLobster You're totally right -- I went on autopilot and put \Gamma where it should have been Gamma, and none of that was really necessary because the code uses scipy's built-in Dirichlet function anyway, so I took out the whole mess and changed it to Dir[]. I also made p(m | m', C) explicitly a vector for clarity.