charles-river-analytics / figaro

Figaro Programming Language and Core Libraries
Other
756 stars 151 forks source link

VE output on disconnected graphs #711

Open bruttenberg opened 7 years ago

bruttenberg commented 7 years ago

At the end of VE, all factors are multiplied together and everything is eliminated but the target. When you have a disconnected model though, you end up with an empty factor that could potentially be zero. When this empty factor is multiplied by the non-empty factors, it can zero out the result. A small fix in VE can fix this:

private def makeResultFactor(factorsAfterElimination: MultiSet[Factor[Double]]): Factor[Double] = { // It is possible that there are no factors (this will happen if there are no queries or evidence). // Therefore, we start with the unit factor and use foldLeft, instead of simply reducing the factorsAfterElimination. val nonEmpty = factorsAfterElimination.filter(p => p.variables.nonEmpty) // This line is new nonEmpty .foldLeft(Factory.unit(semiring))(.product()) }

gtakata commented 7 years ago

Not sure I understand this:

1) Why would we want to multiply disconnected models together? It would seem that you would want to do the elimination only on the one that has the target. The other should have no distribution over the target anyway. 2) Shouldn't the product method already take care of this? ie shouldn't two factors have variables in common in order to be multiplied? If so, multiplying by an empty factor should be a no-op.

bruttenberg commented 7 years ago

I agree with all of that, but that’s not how its set up at the moment. For #1 below, correct, we don’t care about the disconnected portion, but we don’t check the graph. We just do eliminations

For #2, yes that would also be a good solution, but it seems our product doesn’t do this.

I have a fix in VE for this, but my sense is that fixing #2 here might be the right solution.

From: Glenn Takata [mailto:notifications@github.com] Sent: Wednesday, May 31, 2017 11:02 AM To: p2t2/figaro figaro@noreply.github.com Cc: Brian Ruttenberg bruttenberg@cra.com; Author author@noreply.github.com Subject: Re: [p2t2/figaro] VE output on disconnected graphs (#711)

Not sure I understand this:

  1. Why would we want to multiply disconnected models together? It would seem that you would want to do the elimination only on the one that has the target. The other should have no distribution over the target anyway.
  2. Shouldn't the product method already take care of this? ie shouldn't two factors have variables in common in order to be multiplied? If so, multiplying by an empty factor should be a no-op.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/p2t2/figaro/issues/711#issuecomment-305214841, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AFJOJbSzdiOVc-d9SDibWoAqSZhxwFu8ks5r_YDvgaJpZM4Nrtz3.

wkretschmer commented 7 years ago

There are definitely other places in the codebase where we compute a product of factors, so a solution that changes the product method is desired.

I don't think we should require factors to have common variables in order to compute their product. But it's unclear to me how an empty factor is treated as the zero factor. Shouldn't we treat it as the unit factor?

gtakata commented 7 years ago

I agree about allowing factors to combine without overlapping variables.

The issue of zero factors is, I think, an artifact of our implementation. For dense factors, at least, we default the factor values to 0. Since an empty factor does not set any values, and the cell index is always 0, the probability will always be 0.

apfeffer commented 7 years ago

[https://tr.cloudmagic.com/h/v6/emailtag/tag/2.0/1496329780/e74138296a56547dcb2f1ea7c28a0a53/2/e51d88f8b33b1071f12df3e3a5852c07/257c7200deeba4bb31df603511f6cdfd/9efab2399c7c560b34de477b9aa0a465/newton.gif] I think the correct solution is to create a unit factor that has no variables and has probability one for the trivial empty assignment. I don't think this is a problem with product, but with summation. When the last variable is summed out of a factor, it should result in the unit factor, not the zero factor.

++++++++++++++++++++++ Dr. Avi Pfeffer Chief Scientist Charles River Analytics, Inc. (617) 491-3474x513 apfeffer@cra.com www.cra.com

On Thu, Jun 1, 2017 at 11:03 AM, Glenn Takata notifications@github.com wrote:

I agree about allowing factors to combine without overlapping variables.

The issue of zero factors is, I think, an artifact of our implementation. For dense factors, at least, we default the factor values to 0. Since an empty factor does not set any values, and the cell index is always 0, the probability will always be 0.

- You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/p2t2/figaro/issues/711#issuecomment-305521698, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AFJkd4WuET4jSVTIiwaxBwZ0mpjKe88Rks5r_tLdgaJpZM4Nrtz3.

bruttenberg commented 6 years ago

I just realized a fix for this never got pushed into Figaro 5. It's sitting on my machine now (the fix described at the top of this issue).

Do we want me to push it into the new 5 dev branch?