dritchie / probabilistic-js

Turning Javascript into a probabilistic programming language
Other
58 stars 6 forks source link

Bug in trace.proposeChange: incorrect fwd and rvs probabilities with conditioned continuous values? #11

Closed longouyang closed 9 years ago

longouyang commented 9 years ago

A webchurch user found problems with this model:

(mean
 (mh-query
  1000 1
  (define x (if (flip .5) 1 0))

  x

  (if x
      (= (gaussian  10 1 10) 10)
      (= (gaussian -10 1 10) 10))))

This model tries to figure out whether the observation 10 came from N(-10,1) or N(10,1). The model should should strongly prefer N(10, 1) over N(-10,1) but it currently believes that the two Gaussians are equiprobable. This has to do with its use of a relatively untested webchurch feature: supplying a conditioned-value argument to a continuous ERP, e.g., the third argument to (gaussian 10 1 10).

Noah and I tracked the bug down to these lines in RandomExecutionTrace.prototype.proposeChange:

fwdPropLP += nextTrace.newlogprob
rvsPropLP += nextTrace.oldlogprob

Because MH only proposes changes to x and the kernel for flip always proposes switching to the other value (false to true, true to false), the forward and reverse proposal probabilities for this model should always be 1 but aren't. This is because newlogprob and oldlogprob are including the scores of the conditioned variables.

newlogprob is updated in RandomExecutionTrace.prototype.lookup:

var ll = erp.logprob(val, params)
this.newlogprob += ll
if (conditionedValue == undefined) this.genlogprob += ll

Notice that there is also a genlogprob property that only keeps track of variables that were truly generated.

And oldlogprob is updated in RandomExecutionTrace.prototype.traceUpdate:

// Clear out any old random values that are no longer reachable
for(var i=0,rec; rec = oldvarlist[i]; i++) {
  // before checking to see if the old random value is active,
  // make sure that it was not supplanted by a new value
  // (e.g., change of support)
  if( this.vars[rec.name] == rec && !rec.active) {
    this.oldlogprob += rec.logprob
    delete this.vars[rec.name]
  }
}

My proposed fixes are:

  1. For the forward proposal probability, use genlogprob rather than newlogprob. I'm not positive that this is the right fix because newlogprob is modified in LARJ (so maybe this change would break inference in LARJ?)
  2. Fix the reverse proposal probability by skipping over conditioned ERPs when we compute oldlogprob:
// Clear out any old random values that are no longer reachable
for(var i=0,rec; rec = oldvarlist[i]; i++) {
  // before checking to see if the old random value is active,
  // make sure that it was not supplanted by a new value
  // (e.g., change of support)
  if( this.vars[rec.name] == rec && !rec.active) {
    if (!rec.conditioned) { // proposed fix
      this.oldlogprob += rec.logprob
    }
    delete this.vars[rec.name]
  }
}

I'm more confident about this second fix but it'd be nice to have independent confirmation that this is sane.

Thoughts?

dritchie commented 9 years ago

While I think your suggestions are sane and would work, there's a more principled way to resolve this issue that might prevent bugs like this from popping up in the future. A random choice with a conditioned value really shouldn't be stored as a random choice in the inference routine (i.e. in the MH trace). Instead, it really should just be treated as a factor. That way, we don't have to deal with any special casing for "is this variable conditioned or not, can we propose to it, etc." All of that logic could be removed from the codebase.

longouyang commented 9 years ago

In trying to implement this more principled fix, I came across two hurdles:

  1. There's not a whole lot of conditionedValue-specific logic inside trace.js to begin with. We only do something special for conditionedValue in three places (and the logic isn't too complicated).
  2. We still need to keep track of conditioned variables in some way because MH proposals can cause them to be created / destroyed, as in the example model above. Just treating conditioned variables as factors didn't give me the right behavior.

In the lieu of these hurdles, would it be okay if I committed the bandaid I proposed and add your more general fix to my todo list?

dritchie commented 9 years ago

Huh. Item 2. is surprising; that'd be good to look into, eventually. Sure, I don't mind putting a bandaid on things for now :)

longouyang commented 9 years ago

Okay, bandaid applied in af8358dc7cfe53b410a02c6c7cf19d17e2bdb017