stripe / rainier

Bayesian inference in Scala.
https://rainier.fit
Apache License 2.0
433 stars 51 forks source link

Add proper `Half` versions of continuous dists #478

Open avibryant opened 4 years ago

avibryant commented 4 years ago

Rather than using abs which can lead to identifiability problems.

antoniakon commented 4 years ago

Hello, I am implementing a two-way Anova and I want to use Horseshoe prior for variable selection, therefore I need a half-Cauchy distribution (C+(0,1)). I tried to define my own using the following code:

/**
    * A Continuous Distribution that inherits its transforms from a Support object.
    * Based on private[rainier] trait StandardContinuous
    */
  trait StandardContinuous extends Continuous {
//    val support = UnboundedSupport
    val support = BoundedBelowSupport(Real.zero)

    def param: RandomVariable[Real] = {
      val x = new Variable

      val transformed = support.transform(x)

      val density = support.logJacobian(x) + logDensity(transformed)

      RandomVariable(transformed, density)
    }
  }

  object halfCauchy {
    def apply(location: Real, scale: Real): Continuous =
      standardHC().scale(scale).translate(location)

    def standardHC(): StandardContinuous = new StandardContinuous {

      override def logDensity(v: Real): Real =
        (((v * v) + 1) * Math.PI).log * -1 + log(2) //Over the half

      override def generator: Generator[Double] = Generator.from { (r, _) =>
        //classic for sampling from Cauchy to take the ratio of 2 standard normals. take the abs for positive values
        math.abs(r.standardNormal / r.standardNormal)
      }
    }
  }

I compare the posterior densities to jags but they do not seem quite right. Also the expectations are either not correct or, when similar to the expected, the density looks weird (1st image). However, when I use the val support = UnboundedSupport (I know that this is not right and it's the bounded that we should use) the results look ok (2nd image). In both cases I am running HMC(300) with warm-up 10000 and 1 million iterations. I am also using version 0.2.2, but I imagine it will be similar in 0.3.x. Do you have any suggestions why this is giving me incorrect results? Thank you. BoundedHorseshoeRainier UnboundedHorseshoeRainier

avibryant commented 4 years ago

@antoniakon thanks for the report. I'll try to reproduce and figure out what's going on.

In the meantime, it will probably work fine for you to just use Cauchy(0,1) followed by abs. It can throw off diagnostics because you end up with some chains choosing to put the parameters in positive space and others in negative space, which inflates the between-chain variance, but the plots should work out fine.

antoniakon commented 4 years ago

Thank you

avibryant commented 4 years ago

@antoniakon can you show me the full code for the model that produced those density plots? (Since they don't seem to be plots for a standard half-cauchy).

I can't find anything wrong with your code, and when I use equivalent code in 0.3.2 I get reasonable looking results. I'm currently running a SBC test on a HalfCauchy implementation that I'll submit a PR for if it completes successfully.

antoniakon commented 4 years ago

@avibryant thank you for your interest. Please check your email.