Hi,
in the paper it says: "a linear output layer that produces μ_i(s) and Σ_i(s) for each primitive".
However, the formula for the resulting distribution divides by the variance, so a zero variance would be a problem. A purely linear layer could also output a negative variance, and I don't think that makes sense either.
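To make the concern concrete, here is a minimal sketch of one way the raw linear output could be kept strictly positive before the division. This is my assumption, not something taken from the paper or its code: the names `softplus`, `compose`, `raw_scales` and `eps` are hypothetical, the case is reduced to a single action dimension, and the weights are assumed to be positive.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)); non-negative for any finite x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def compose(weights, means, raw_scales, eps=1e-6):
    """Combine 1-D Gaussian primitives by weighted precision (assumed form)."""
    # Map the unconstrained linear outputs to strictly positive scales.
    scales = [softplus(r) + eps for r in raw_scales]
    # The division below is safe because every scale is at least eps.
    precision = sum(w / s for w, s in zip(weights, scales))
    mean = sum(w / s * m for w, s, m in zip(weights, scales, means)) / precision
    return mean, 1.0 / precision

mean, scale = compose([0.5, 0.5], [1.0, 3.0], [0.0, -4.0])
print(mean, scale)
```

With something like this, a raw output of zero or below would still give a valid scale. Predicting a log-variance and exponentiating it would be another common option, so I am wondering which one (if any) is used.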