Generalized EM crashes or produces wrong answer (#540). Closed: apfeffer closed this issue 8 years ago.
I have validated that this exception is still thrown in the current version of Figaro, 4.0.
Ok, I took a look at this. There are several problems going on here:
Actually, I think the BP issue is the same problem, but since BP is not a sampling algorithm it never generates the parameterized elements. I'm not 100% sure, though.
Edit: Confirmed, same problem. When I make the parameterized elements permanent (i.e., not inside the If), it passes. Still not sure how to fix this for BP.
So, this is not fixable for BP. In generalized EM, we select the targets (parameterizable elements) from the active elements before the algorithm starts. Since, in this case, these elements live inside a chain, they do not exist yet. And since they don't exist, they are not added to the targets list of BP, and since BP expands outward from the target, nothing is expanded and no learning is performed.
Basically, this is almost a violation of Figaro semantics: you can't have the target of an algorithm be a temporary variable. (Technically it does not violate the semantics, because the variable is part of the If, which takes its arguments by name.)
I see no way to fix this besides moving the elements out of the If.
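The timing problem can be illustrated outside Figaro. Below is a minimal standalone sketch (hypothetical names, not Figaro's actual classes) of why an element created inside a by-name argument is invisible when targets are collected up front:

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical miniature of the behavior described above (illustrative
// names, not Figaro's API): every Element registers itself with the
// active-element list when constructed, and a Chain/If takes its result
// by name, so inner elements are not constructed (hence not registered,
// hence not selectable as targets) until the chain is expanded.
object LazyChainSketch {
  val activeElements = ListBuffer[String]()

  class Element(val name: String) {
    activeElements += name // registration happens at construction time
  }

  // `result` is by-name: its evaluation is deferred until `expand` is forced
  class Chain(name: String, result: => Element) extends Element(name) {
    lazy val expand: Element = result
  }

  def run(): (List[String], List[String]) = {
    val param = new Element("beta-param-outside") // a permanent element
    val chain = new Chain("if-chain", new Element("flip-inside-if"))
    // Generalized EM picks its targets here, before inference starts:
    val targetsBefore = activeElements.toList
    chain.expand // only now does the inner element come into existence
    (targetsBefore, activeElements.toList)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Because the target list is fixed before the chain is ever expanded, anything constructed inside the by-name body can never appear in it, which is exactly the situation generalized EM hits with parameterized elements inside an If.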
Hmm. I think the problem is that we’re conflating the concept of a target (as in a variable we query) and an element that contributes to sufficient statistics. EM is implemented in such a way that we make these elements targets, but that’s not really correct. There’s no semantic reason not to use a parameter inside a chain – it’s not a violation of Figaro semantics. Rather, our insistence that such parameters be treated as targets is the violation.
I suggest we put a rethink of generalized EM on the agenda. In the meantime, perhaps we should remove EMWithBP until we can get this right.
Yes, I agree that treating the parameters as targets is the problem here. For the sampling algorithms this works because, on successive iterations, you are likely to generate the "targets", though it could still be a problem if your targets have very small priors.
I'm not sure removing EMWithBP completely is a good idea, since it still works fine for most things. But maybe we should discourage its use for the time being?
Well, VE works because we made a special factor type for tracking sufficient statistics. We don't have a version of BP that uses those factors currently, though. Maybe creating one is the solution, and probably easier after the factor unification Brian did.
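To make the sufficient-statistics idea concrete, here is a standalone sketch. This is not Figaro's actual sufficient-statistics factor implementation, just an illustration of the algebra such a factor entry would need:

```scala
// Hypothetical sketch (NOT Figaro's actual factor type; names and layout
// are illustrative). Each factor entry pairs a probability with expected
// counts for the learnable parameters. Under factor product the
// probabilities multiply and the counts add; under marginalization the
// probabilities add and the counts combine as a probability-weighted
// average. A BP variant whose message products used entries like this
// could, in principle, propagate statistics the way the special VE
// factors do.
case class StatEntry(prob: Double, counts: Map[String, Double]) {

  private def keys(that: StatEntry) = counts.keySet ++ that.counts.keySet

  // Factor product: multiply probabilities, add expected counts.
  def *(that: StatEntry): StatEntry =
    StatEntry(
      prob * that.prob,
      keys(that).map { k =>
        k -> (counts.getOrElse(k, 0.0) + that.counts.getOrElse(k, 0.0))
      }.toMap
    )

  // Marginalization: add probabilities, average counts by probability mass.
  def +(that: StatEntry): StatEntry = {
    val total = prob + that.prob
    if (total == 0.0) StatEntry(0.0, Map.empty)
    else StatEntry(
      total,
      keys(that).map { k =>
        k -> (prob * counts.getOrElse(k, 0.0) +
              that.prob * that.counts.getOrElse(k, 0.0)) / total
      }.toMap
    )
  }
}
```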
By the way, we also have issue #543, which is to create a trait for Marginal MAP algorithms. Alex's presentations and paper mention that EM is a marginal MAP algorithm. Maybe when we define this trait, it will help us find a better way of specifying parameters/MAP target variables.
You have pretty much seen the marginal MAP interface. Except for the .map file, it looks like the other solvers, with a .uai and an .evi file. There is nothing special in the parameters for EM.
I wonder if it’s possible to use sufficient statistics factors in sampling algorithms over factors. Put differently, can we make all our factored algorithms work with any Semiring? If so, we could make EM work in an SFI framework.
It should be possible, since factor operations are all in terms of sum and product (however defined). If we can delegate these to the semiring and ensure that only the semiring handles them, then using a Semiring trait with suitable class instantiations should work. I may be misremembering, but it seemed that, at one time, semirings also carried supplemental information, e.g., semirings for decisions, that was used differently. This might be problematic.
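A minimal sketch of that delegation (illustrative names, not Figaro's actual Semiring API): factor operations are written only against an abstract sum and product, so swapping in a different semiring changes what the algorithm computes without touching the factor code:

```scala
// Illustrative sketch, not Figaro's API. With factor operations delegated
// entirely to the semiring, sum-product computes marginals while
// max-product computes MAP values from the same factor code.
trait Semiring[T] {
  def zero: T
  def sum(a: T, b: T): T
  def product(a: T, b: T): T
}

object SumProduct extends Semiring[Double] {
  val zero = 0.0
  def sum(a: Double, b: Double) = a + b
  def product(a: Double, b: Double) = a * b
}

object MaxProduct extends Semiring[Double] {
  val zero = 0.0
  def sum(a: Double, b: Double) = math.max(a, b) // "sum" is maximization
  def product(a: Double, b: Double) = a * b
}

object FactorOps {
  // Pointwise product of two aligned factor columns, delegated to the semiring.
  def pointwiseProduct[T](a: Seq[T], b: Seq[T], sr: Semiring[T]): Seq[T] =
    a.zip(b).map { case (x, y) => sr.product(x, y) }

  // Summing out a variable reduces its column using only the semiring's sum.
  def eliminate[T](column: Seq[T], sr: Semiring[T]): T =
    column.foldLeft(sr.zero)(sr.sum)
}
```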
Added a warning when running BP and EM
I am finding severe problems with generalized EM. Again, this is in 3.3; please check whether this is fixed in 4.0.
The following program runs correctly when EM with VE is used as the learning algorithm (the listing shows the EMWithMH variant, which produces the failure described afterward):
```scala
package exercises

import com.cra.figaro.patterns.learning._
import com.cra.figaro.library.atomic.continuous.Beta
import com.cra.figaro.patterns.learning.ParameterCollection
import com.cra.figaro.language.Flip
import com.cra.figaro.library.compound.If
import com.cra.figaro.algorithm.learning._
import com.cra.figaro.algorithm.factored.VariableElimination
import com.cra.figaro.algorithm.sampling.ProposalScheme

object ex133 {
  val params = ModelParameters()

  val xParam = Beta(1, 1)("x", params)
  val yGivenXParam = Beta(2, 1)("yGivenX", params)
  val yGivenNotXParam = Beta(1, 2)("yGivenNotX", params)
  val zGivenYParam = Beta(1, 1)("zGivenY", params)
  val zGivenNotYParam = Beta(1, 1)("zGivenNotY", params)

  class Model(pc: ParameterCollection) {
    val x = Flip(pc.get("x"))
    val y = If(x, Flip(pc.get("yGivenX")), Flip(pc.get("yGivenNotX")))
    val z = If(y, Flip(pc.get("zGivenY")), Flip(pc.get("zGivenNotY")))
  }

  for { i <- 1 to 10 } {
    val xz = scala.util.Random.nextBoolean()
    val model = new Model(params.priorParameters)
    model.x.observe(xz)
    model.z.observe(xz)
  }

  val time0 = System.currentTimeMillis()

  val learningAlg = EMWithMH(10, 1000, ProposalScheme.default, params)
  learningAlg.start()
  val time1 = System.currentTimeMillis()

  println("Time: " + ((time1 - time0) / 1000.0))

  val futureModel = new Model(params.posteriorParameters)
  futureModel.x.observe(true)
  println(VariableElimination.probability(futureModel.z, true))

  def main(args: Array[String]) {
  }
}
```
However, importance sampling, MH, and BP all fail. MH and BP both crash with the exception below. Importance sampling takes an inordinately long time (about 500 times longer than one EM iteration) and returns the incorrect answer of 0.5, which is the prior probability. Does anyone have an explanation for this?
```
Exception in thread "main" java.lang.ExceptionInInitializerError
	at exercises.ex133.main(ex133.scala)
Caused by: com.cra.figaro.algorithm.BaseProbQueryAlgorithm$NotATargetException
	at com.cra.figaro.algorithm.BaseProbQueryAlgorithm$class.check(ProbQueryAlgorithm.scala:101)
	at com.cra.figaro.algorithm.BaseProbQueryAlgorithm$class.distribution(ProbQueryAlgorithm.scala:113)
	at com.cra.figaro.algorithm.sampling.OneTimeMetropolisHastings.distribution(MetropolisHastings.scala:445)
	at com.cra.figaro.algorithm.learning.GeneralizedEM$$anonfun$doExpectationStep$1$$anonfun$apply$1.apply(GeneralizedEM.scala:167)
	at com.cra.figaro.algorithm.learning.GeneralizedEM$$anonfun$doExpectationStep$1$$anonfun$apply$1.apply(GeneralizedEM.scala:163)
	at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
	at com.cra.figaro.algorithm.learning.GeneralizedEM$$anonfun$doExpectationStep$1.apply(GeneralizedEM.scala:163)
	at com.cra.figaro.algorithm.learning.GeneralizedEM$$anonfun$doExpectationStep$1.apply(GeneralizedEM.scala:160)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at com.cra.figaro.algorithm.learning.GeneralizedEM.doExpectationStep(GeneralizedEM.scala:160)
	at com.cra.figaro.algorithm.learning.ExpectationMaximization$class.iteration(GeneralizedEM.scala:75)
	at com.cra.figaro.algorithm.learning.GeneralizedEM.iteration(GeneralizedEM.scala:148)
	at com.cra.figaro.algorithm.learning.ExpectationMaximization$class.em(GeneralizedEM.scala:64)
	at com.cra.figaro.algorithm.learning.GeneralizedEM.em(GeneralizedEM.scala:148)
	at com.cra.figaro.algorithm.learning.ExpectationMaximization$class.doStart(GeneralizedEM.scala:36)
	at com.cra.figaro.algorithm.learning.GeneralizedEM.doStart(GeneralizedEM.scala:148)
	at com.cra.figaro.algorithm.Algorithm$class.start(Algorithm.scala:83)
	at com.cra.figaro.algorithm.learning.GeneralizedEM.start(GeneralizedEM.scala:148)
	at exercises.ex133$.<init>(ex133.scala:39)
	at exercises.ex133$.<clinit>(ex133.scala)
	... 1 more
```