mjhajharia / transforms


Add Stan models for positive definite transforms #31

Closed sethaxen closed 1 year ago

sethaxen commented 2 years ago

Adds Stan models with the following 2 transforms for positive definite matrices:

The transforms and derivations of the log Jacobian determinants are already in the paper. For now I have placed the transforms in a model with a target Wishart distribution. @mjhajharia what's a good approach for the transforms to be reused for the other target distributions listed in #15?
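As a rough illustration of the current setup (a hypothetical sketch, not the exact code in this PR), a Cholesky-based positive definite transform with a Wishart target could look like this; the log-Jacobian term follows the standard exp-diagonal Cholesky derivation:

```stan
data {
  int<lower=1> N;            // matrix dimension
  real<lower=N - 1> nu;      // Wishart degrees of freedom
  matrix[N, N] S;            // Wishart scale matrix
}
parameters {
  vector[N + (N * (N - 1)) %/% 2] y;  // unconstrained parameters
}
transformed parameters {
  matrix[N, N] X;
  real logJ = 0;
  {
    matrix[N, N] L = rep_matrix(0, N, N);
    int k = 1;
    for (i in 1:N) {
      for (j in 1:(i - 1)) {
        L[i, j] = y[k];
        k += 1;
      }
      L[i, i] = exp(y[k]);   // positive diagonal via exp
      k += 1;
    }
    // log |J| for y -> X = L L': exp on the diagonal plus the L -> L L' factor
    for (i in 1:N)
      logJ += (N - i + 2) * log(L[i, i]);
    logJ += N * log2();
    X = multiply_lower_tri_self_transpose(L);
  }
}
model {
  target += logJ;
  target += wishart_lupdf(X | nu, S);
}
```

The Jacobian term matches the one Stan's built-in `cov_matrix` transform uses, so it can serve as a sanity check against the paper's derivation.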

Relates #15

mjhajharia commented 2 years ago

at the moment we're using a function in stan and just using it as a dictionary of sorts that gives out the logprob term for a given distribution. so once the transform's stan file is merged I'll create that. don't put the distribution in the model section for now
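One way that pattern could look (a hypothetical sketch, not the actual file; the integer codes, function name, and argument list are made up here) is a single Stan function that dispatches on a distribution code, so each transform file can be reused with any target:

```stan
functions {
  // Hypothetical dispatcher: returns the log density of X under the target
  // selected by `dist` (1 = Wishart, 2 = inverse Wishart, ...).
  real target_density(matrix X, int dist, real nu, matrix S) {
    if (dist == 1) {
      return wishart_lpdf(X | nu, S);
    } else if (dist == 2) {
      return inv_wishart_lpdf(X | nu, S);
    }
    reject("unknown dist code: ", dist);
    return negative_infinity();  // unreachable; satisfies the return check
  }
}
```

The model block would then just add `target_density(X, dist, nu, S)` to `target`, with `dist` passed in as data.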

mjhajharia commented 2 years ago

btw @sethaxen I was cleaning up the paper and in the augmented-softmax section I find the notation a little unclear: is x_ the whole vector or -ve? this is dumb but i often get very confused with notation. also ig, should we be mentioning the paper's example you thought of this from?

sethaxen commented 2 years ago

> at the moment we're using a function in stan and just using it as a dictionary of sorts that gives out the logprob term for a given distribution. so once the transform's stan file is merged I'll create that.

Cool! I'm interested to see an example because I don't know how to do that in Stan.

> don't put the distribution in the model section for now

Done!

There's at least one more positive definite transform to add, but it needs to wait for Stiefel transforms, so I'll add it in a separate PR. PSD matrices will also be handled separately, even though there may be some code duplication.

> btw @sethaxen I was cleaning up the paper and in the augmented-softmax section I find the notation a little unclear, is x_ the whole vector or -ve? this is dumb but i often get very confused with notation.

Not dumb at all! Notation is hard, and I'll refine the notation a few more times before the end. In this case, x_ is defined in the softmax (non-augmented) section as all but the last entry of x. But I think @bob-carpenter's suggestion of a more programmer-friendly notation should be adopted in the two softmax derivations (here it would be x[1:n-1]), though that may not work as well in later sections (e.g. Stiefel).
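For concreteness, the two candidate notations for the same object (assuming x has N entries, as in the softmax section):

```latex
x_- \equiv (x_1, \dots, x_{N-1})
\qquad \text{vs.} \qquad
\texttt{x[1:n-1]}
```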

> also ig, should we be mentioning the paper's example you thought of this from?

I don't think so. The route from that paper to this transform was convoluted, and where we ended up (softmax) is better motivated just by mentioning softmax. But if the geometric intuition (unconstrained space -> positive orthant of sphere -> simplex) helps to explain the results, then we could describe it and mention that paper (which is already cited elsewhere).
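A sketch of that geometric route, as I'd reconstruct it (the specific maps here are my reading of the construction, not necessarily the paper's exact ones): map the unconstrained point to the positive orthant of the sphere by positivizing and normalizing, then square the coordinates to land on the simplex:

```latex
y \in \mathbb{R}^N
\;\longrightarrow\;
z \in S^{N-1}_{+}
\;\xrightarrow{\;x_i = z_i^2\;}
x \in \Delta^{N-1},
\qquad
\sum_i x_i = \sum_i z_i^2 = \lVert z \rVert^2 = 1 .
```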