scalanlp / breeze

Breeze is/was a numerical processing library for Scala.
https://www.scalanlp.org
Apache License 2.0
3.44k stars 692 forks source link

Failure in breeze.stats.distributions.Gamma.mle #11

Closed tjhunter closed 12 years ago

tjhunter commented 12 years ago

I encountered this case that fails: can you confirm?

import breeze.stats.distributions.Gamma
    val mean = 2.834312
    val meanOfLogs = -0.689661
    val n=5.000000
    val ss = Gamma.SufficientStatistic(n, meanOfLogs, mean)
    val (k, theta) = Gamma.mle(ss)
dlwh commented 12 years ago

Thanks, I'll look into it tonight. I've noticed that the Gamma MLE is kind of flaky, maybe I've got the objective function wrong.

dlwh commented 12 years ago

Could I get you to compare Gamma.scala:188 and thereabouts with the k/theta pdf here : http://en.wikipedia.org/wiki/Gamma_distribution

I'm pretty sure it's right ( I just double checked), but another pair of eyes would be good.

tjhunter commented 12 years ago

No rush! There is one thing I do not understand in Gamma.scala: why do you need solve a 2d problem using BFGS? Based on wikipedia, you could solve for k first and then get theta?

I had written the following implementation of the Gamma MLE, before realizing it is a duplicate effort with what you have implemented and switching to scalanlp / breeze:

object GammaLearn {
    import org.apache.commons.math.special.Gamma._

  def mle(ss: GammaNLP.SufficientStatistic) = {
    val s = math.log( ss.mean ) - ss.meanOfLogs
    assert(s > 0 , s) // check concavity
    val k_approx = approx_k(s)
    assert(k_approx > 0 , k_approx)
    val k = Nwt_Rph_iter_for_k(k_approx, s)
    val theta = ss.mean / (k)
    (k, theta)
  }
  /*
   * s = log( x_hat) - log_x_hat
   */
  def approx_k(s:Double) : Double = {
    // correct within 1.5%
    (3 - s + math.sqrt( math.pow((s-3),2) + 24*s )) / (12 * s)
  }

  def Nwt_Rph_iter_for_k(k:Double, s:Double ): Double = {
    /*
     * For a more precise estimate, use Newton-Raphson updates
     */
    val k_new = k - (math.log(k) - digamma(k) - s)/( 1.0/k - trigamma(k) )
    if (math.abs(k - k_new)/math.abs(k_new) < 1.0e-4)
      k_new
    else 
      Nwt_Rph_iter_for_k(k_new,s)
  }
}
dlwh commented 12 years ago

laziness, mostly.

I'd happily take a pull request, or I can just patch it in.

-- David

On Sat, Sep 1, 2012 at 10:24 PM, Timothy Hunter notifications@github.comwrote:

No rush! There is one thing I do not understand in Gamma.scala: why do you need solve a 2d problem using BFGS? Based on wikipedia, you could solve for k first and then get theta?

I had written the following implementation of the Gamma MLE, before realizing it is a duplicate effort with what you have implemented and switching to scalanlp / breeze:

object GammaLearn { import org.apache.commons.math.special.Gamma._

def mle(ss: GammaNLP.SufficientStatistic) = { val s = math.log( ss.mean ) - ss.meanOfLogs assert(s > 0 , s) // check concavity val k_approx = approx_k(s) assert(k_approx > 0 , k_approx) val k = Nwt_Rph_iter_for_k(k_approx, s) val theta = ss.mean / (k) (k, theta) } /* * s = log( x_hat) - log_xhat / def approx_k(s:Double) : Double = { // correct within 1.5% (3 - s + math.sqrt( math.pow((s-3),2) + 24_s )) / (12 * s) }

def Nwt_Rph_iter_for_k(k:Double, s:Double ): Double = { /* * For a more precise estimate, use Newton-Raphson updates */ val k_new = k - (math.log(k) - digamma(k) - s)/( 1.0/k - trigamma(k) ) if (math.abs(k - k_new)/math.abs(k_new) < 1.0e-4) k_new else Nwt_Rph_iter_for_k(k_new,s) }}

— Reply to this email directly or view it on GitHubhttps://github.com/scalanlp/breeze/issues/11#issuecomment-8219173.

dlwh commented 12 years ago

ok, completely unrelatedly, I was debugging a problem in my lbfgs routine (turns out I was updating the history in the wrong order, sigh), and that fixed this problem too. I added a unit test, and I'm going to integrate your function. Let me know if that's not ok.

tjhunter commented 12 years ago

No worry, I will send a pull request myself with my cleaned-up mle function tomorrow. Something (unrelated but really helpful for me) would be to cross compile breeze to scala 2.9.1 . Do you want me to open another feature request?

Tim

On Sun, Sep 2, 2012 at 1:27 PM, David Hall notifications@github.com wrote:

ok, completely unrelatedly, I was debugging a problem in my lbfgs routine (turns out I was updating the history in the wrong order, sigh), and that fixed this problem too. I added a unit test, and I'm going to integrate your function. Let me know if that's not ok.

— Reply to this email directly or view it on GitHubhttps://github.com/scalanlp/breeze/issues/11#issuecomment-8225792.

Timothy Hunter Ph.D Student Computer Science University of California - Berkeley www.eecs.berkeley.edu/~tjhunter/ T. 404 421 3075

dlwh commented 12 years ago

2.9.2 is binary compatible with 2.9.1, just use literal version numbers in sbt.

I already committed your code, assuming you'd approve.

-- David

On Sun, Sep 2, 2012 at 5:12 PM, Timothy Hunter notifications@github.comwrote:

No worry, I will send a pull request myself with my cleaned-up mle function tomorrow. Something (unrelated but really helpful for me) would be to cross compile breeze to scala 2.9.1 . Do you want me to open another feature request?

Tim

On Sun, Sep 2, 2012 at 1:27 PM, David Hall notifications@github.com wrote:

ok, completely unrelatedly, I was debugging a problem in my lbfgs routine (turns out I was updating the history in the wrong order, sigh), and that fixed this problem too. I added a unit test, and I'm going to integrate your function. Let me know if that's not ok.

— Reply to this email directly or view it on GitHub< https://github.com/scalanlp/breeze/issues/11#issuecomment-8225792>.

Timothy Hunter Ph.D Student Computer Science University of California - Berkeley www.eecs.berkeley.edu/~tjhunter/ T. 404 421 3075

— Reply to this email directly or view it on GitHubhttps://github.com/scalanlp/breeze/issues/11#issuecomment-8227566.

tjhunter commented 12 years ago

Haha, sure. It is under apache 2.0/public domain license.

On Sun, Sep 2, 2012 at 5:15 PM, David Hall notifications@github.com wrote:

2.9.2 is binary compatible with 2.9.1, just use literal version numbers in sbt.

I already committed your code, assuming you'd approve.

-- David

On Sun, Sep 2, 2012 at 5:12 PM, Timothy Hunter notifications@github.comwrote:

No worry, I will send a pull request myself with my cleaned-up mle function tomorrow. Something (unrelated but really helpful for me) would be to cross compile breeze to scala 2.9.1 . Do you want me to open another feature request?

Tim

On Sun, Sep 2, 2012 at 1:27 PM, David Hall notifications@github.com wrote:

ok, completely unrelatedly, I was debugging a problem in my lbfgs routine (turns out I was updating the history in the wrong order, sigh), and that fixed this problem too. I added a unit test, and I'm going to integrate your function. Let me know if that's not ok.

— Reply to this email directly or view it on GitHub< https://github.com/scalanlp/breeze/issues/11#issuecomment-8225792>.

Timothy Hunter Ph.D Student Computer Science University of California - Berkeley www.eecs.berkeley.edu/~tjhunter/ T. 404 421 3075

— Reply to this email directly or view it on GitHub< https://github.com/scalanlp/breeze/issues/11#issuecomment-8227566>.

— Reply to this email directly or view it on GitHubhttps://github.com/scalanlp/breeze/issues/11#issuecomment-8227585.

Timothy Hunter Ph.D Student Computer Science University of California - Berkeley www.eecs.berkeley.edu/~tjhunter/ T. 404 421 3075