probcomp / Gen.jl

A general-purpose probabilistic programming system with programmable inference
https://gen.dev
Apache License 2.0
1.79k stars 160 forks

"Soft" choicemap contraints in generate & MCMC? #231

Open collinskatie opened 4 years ago

collinskatie commented 4 years ago

Hi, I'm trying to use MCMC to infer the objects in a scene that would yield certain observations, but I'm having trouble conditioning on these observations. We want to allow some deviation of our simulated trajectory from the exact observations. Is there a way for Gen to enforce "soft" constraints and still infer the object placement that would give simulated values "close to" those of the observations? From my reading of Gen's documentation, it seems that choicemap constraints in generate and in the MCMC implementation are strictly enforced. Is there a way to condition on observations similar to "observe" in WebPPL?

Likewise, when creating a choicemap, it seems that observations need to be matched directly with particular addresses in the forward function. Is there a way to match values in the generator with observations through a choicemap if these have different lengths? For instance, when tracking an object's motion and conditioning on its position, the true object may be observed at fewer time points than are simulated. Is there additional flexibility in how to handle observations vs. sampled values during iterated inference? I'm very new to Gen, so apologies if these are simple questions or I'm misunderstanding something.

Thanks for any help!!

ztangent commented 4 years ago

Hey, thanks for using Gen! In response to your first question, yes, you're right that constraints are strictly enforced, which can lead to issues if your observed variables x are some deterministic function f(z) of the latent variables z. The standard way to address this in Gen is to add a small amount of (Gaussian) noise around the variable x, by drawing noisy_x ~ normal(x, 0.1) and then conditioning on noisy_x instead. There's an example of this in the Gen quickstart tutorials. You can think of this as adding observation noise, or alternatively as a technical trick that lets you soften deterministic constraints. I'm not sure what WebPPL is doing under the hood, but it's likely that they're doing something similar.
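A minimal sketch of the noise trick described above, assuming a toy deterministic simulator. The names simulate_x, soft_model, and run_mh, the prior on z, and the observed value 3.2 are all made up for illustration; only generate, choicemap, select, and mh are Gen functions.

```julia
using Gen

# Hypothetical deterministic simulator: maps a latent z to an observable x.
simulate_x(z) = 2.0 * z + 1.0

@gen function soft_model(obs_noise::Float64)
    z ~ normal(0.0, 5.0)            # latent variable (illustrative prior)
    x = simulate_x(z)               # deterministic function of the latent
    noisy_x ~ normal(x, obs_noise)  # the address we actually condition on
    return x
end

# Condition on the noisy address rather than on x itself.
observations = choicemap((:noisy_x, 3.2))
trace, weight = generate(soft_model, (0.1,), observations)

# Simple Metropolis-Hastings loop that resamples only the latent z.
function run_mh(trace, n_iters)
    for _ in 1:n_iters
        trace, _ = mh(trace, select(:z))
    end
    return trace
end

final_trace = run_mh(trace, 200)
```

A larger obs_noise makes the constraint "softer": traces whose simulated x is further from the observation are penalized less.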

ztangent commented 4 years ago

As for matching up addresses in the choicemap with the addresses in the generative function, yes, you need to make sure that every address in the choicemap is present in the generative function, or you might wind up with errors. But as long as you know which timesteps you observed your data at, this shouldn't be an issue. For example, if you observed some :x at timesteps 2, 3, 5 and 8 (and if the generative function assigns them addresses of the form i => :x accordingly, which you'd get if you use the Map or Fold combinators), then you can construct the choicemap as:

choicemap((2 => :x, obs_x[2]), (3 => :x, obs_x[3]), (5 => :x, obs_x[5]), (8 => :x, obs_x[8]))
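The same choicemap can also be built in a loop when the observed timesteps are stored in a collection. This is only a sketch: observed_steps and the values in obs_x are placeholder data, and the i => :x address layout is the one assumed above.

```julia
using Gen

# Placeholder data: timesteps at which :x was observed, and the observed values.
observed_steps = [2, 3, 5, 8]
obs_x = Dict(2 => 0.4, 3 => 0.7, 5 => 1.3, 8 => 2.1)

# Constrain only the addresses that were actually observed.
constraints = choicemap()
for t in observed_steps
    constraints[t => :x] = obs_x[t]
end
```

Timesteps that are not in observed_steps stay unconstrained, so Gen samples them during inference.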

If you don't know at which timesteps you observed the data -- well, trying to figure out which observations match up with which generated values would be an inference problem in itself. You could try to find the optimal matching between the generated and observed values and assume that it gives the correct timesteps. If you wanted to be more Bayesian about it, you could possibly define a noise model that takes a generated sequence and randomly drops some of the timepoints, and then do inference over that -- but that might be overkill, and I'm not sure if inference is tractable when the sequences get really long (because there are tons of ways you can drop some timesteps to produce a shorter sequence).

collinskatie commented 4 years ago

Thanks for your responses @ztangent! I have a couple of follow-up questions:

1) Is it possible to have custom scoring for a custom trace - for instance, a custom likelihood in Metropolis-Hastings?

2) Is there an equivalent of a null/None value that can be set for an observation in a choicemap/@trace?

For context, in our problem we have a set of observations (x,y coordinates of a ball), but the ball can be occluded (so we would have no x,y information), and it can come in and out of occlusion. In our generative world, however, we have full access to the x,y position of the ball at all times. The observations and the simulated trajectory have the same time points, but we are trying to match these up and use custom scoring when the ball is occluded in the observed trajectory (where we could then have "access" to both the simulated values that we usually set in @trace and our observations from the choicemap).

That's where we were thinking it'd be nice to have a way to set occluded coordinates equal to a null value, but we weren't sure whether this would break the requirement of keeping the same support. If that's not possible, we were thinking we could set constraints[:traj => i => :x] = None (as an example, where i is the index into our trajectory) or a similar value and then handle it accordingly in a custom scoring function.

Thanks for any help!!

alex-lew commented 4 years ago

Hi @collinskatie!

Here is one way you could think about modeling your scenario:


# A helper generative function that takes in the true x
# and the true y, and samples an observation.
# (environment, measurement_noise, and calculate_prob_occluded are
#  assumed to be defined elsewhere.)
@gen function observe_location(true_x, true_y)
  # Check where x and y are, and whether this would likely lead to a failure
  # to observe the data.
  prob_occluded = calculate_prob_occluded(environment, true_x, true_y)

  # Generate whether we actually do fail to observe the data due to occlusion
  occluded ~ bernoulli(prob_occluded)

  # If we do observe the data, there is probably still some noise
  if !occluded
    x ~ normal(true_x, measurement_noise)
    y ~ normal(true_y, measurement_noise)
  end
end

@gen function model()
  # Sample an initial location and observation
  x = {:traj => 0 => :x} ~ initial_x_prior()
  y = {:traj => 0 => :y} ~ initial_y_prior()
  {:observations => 0} ~ observe_location(x, y)

  # Run the dynamics
  for i = 1:T
    # (This is super simplified -- you might have a more detailed "time step"
    #  model for how to sample the next point in the trajectory.)
    x = {:traj => i => :x} ~ normal(x + velocity_x, movement_noise)
    y = {:traj => i => :y} ~ normal(y + velocity_y, movement_noise)
    # Observe the latest point (or not, if occluded)
    {:observations => i} ~ observe_location(x, y)
  end
end

Then, your observations would either have the form

constraints[:observations => i => :occluded] = true

(if occluded), or:

constraints[:observations => i => :occluded] = false
constraints[:observations => i => :x] = 5.23
constraints[:observations => i => :y] = 10

This is equivalent to a particular choice of "custom likelihood" that you could use for observations; and any other probabilistically coherent choice of custom likelihood should also have some representation as a generative model in Gen. Happy to help think through how to represent the likelihood you have in mind (if there's a particular one you'd like to represent) in Gen, if you want to provide some more details! :-)
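As a rough sketch, the two constraint forms above could be assembled from recorded data in which an occluded frame is stored as Julia's nothing. The observed vector and the make_constraints helper are hypothetical; the addresses follow the :observations => i layout of the model above, which indexes time from 0.

```julia
using Gen

# Hypothetical recorded data: one entry per time point, `nothing` when occluded.
observed = [(1.0, 2.0), nothing, (1.4, 2.3)]

function make_constraints(observed)
    constraints = choicemap()
    for (k, obs) in enumerate(observed)
        i = k - 1  # the model indexes time points from 0
        if obs === nothing
            # Occluded frame: constrain only the occlusion flag.
            constraints[:observations => i => :occluded] = true
        else
            # Visible frame: constrain the flag and both coordinates.
            constraints[:observations => i => :occluded] = false
            constraints[:observations => i => :x] = obs[1]
            constraints[:observations => i => :y] = obs[2]
        end
    end
    return constraints
end

constraints = make_constraints(observed)
```

The null value never enters the choicemap itself: an occluded frame simply leaves the :x and :y addresses out, which matches a model that does not sample them when occluded is true.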

collinskatie commented 4 years ago

Hi @alex-lew - thanks for your response!! That sounds like a great idea to use a helper generative function - I'll try that out!!

A few more details about our particular problem - the ball we are following sometimes goes behind a rectangular occluder and then comes back out, so based on the x,y position in this space, we would know whether it's occluded or not (and this occluder covers the same fixed region over all trials). So calculate_prob_occluded(x,y) would be deterministic, based only on where the ball is. But we could try adding some noise there, and then treat the noise later in the if statement as "perceptual noise"? Also for some more context, we get the ball position by calling a physics engine from within Gen for each time point.
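For a fixed rectangular occluder, a helper along these lines might look as follows. This is a sketch under assumptions: the RectOccluder type, its field names, and the eps keyword are invented here. With eps = 0 the function returns exactly 0 or 1, and any trace whose simulated occlusion disagrees with the observed flag then gets probability zero; a small positive eps is one way to add the noise mentioned above.

```julia
# Hypothetical axis-aligned occluder; field names are illustrative.
struct RectOccluder
    xmin::Float64
    xmax::Float64
    ymin::Float64
    ymax::Float64
end

# True when the point (x, y) lies behind the occluder.
is_behind(o::RectOccluder, x, y) =
    o.xmin <= x <= o.xmax && o.ymin <= y <= o.ymax

# Occlusion probability: near 1 behind the occluder, near 0 elsewhere.
# eps is an assumed "perceptual noise" level; eps = 0 is fully deterministic.
calculate_prob_occluded(o::RectOccluder, x, y; eps=0.01) =
    is_behind(o, x, y) ? 1 - eps : eps

occluder = RectOccluder(0.0, 1.0, 0.0, 1.0)
p_inside = calculate_prob_occluded(occluder, 0.5, 0.5)
p_outside = calculate_prob_occluded(occluder, 2.0, 0.5)
```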

I will certainly try out your suggestions, though! One other small point that we were concerned about when modelling is the case where the simulator thinks the ball should be visible, but the observations have no data there (i.e. :occluded = true like you have above). In this case, we were thinking of scoring our trajectory for that point by comparing x,y positions, treating the observed ball as if it were "at the point under the occluder that's closest to our simulated point, as if it were just about to become unoccluded" - so that we sort of have an "optimism" bias and can still look at the "distance" between the believed and observed points. Hopefully that made sense, but if not I can try to provide more details and clarify further. Going off of that, I'm curious if you think an idea like that could be modelled with the above framework and helper generative function? Or would it need a different kind of "custom likelihood" or a custom definition of Gen.score (if that's possible)?
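One possible reading of this "optimism" idea, offered only as a sketch: let the occlusion probability decay with the distance from the simulated point to the nearest point of the occluder. Conditioning on occluded = true under a Bernoulli with that probability then contributes a log-probability of about -d^2 / (2 * scale^2), a Gaussian-style penalty on that distance. RectOccluder, closest_point, soft_prob_occluded, scale, and eps are all hypothetical names and parameters, not part of Gen.

```julia
# Hypothetical axis-aligned occluder; field names are illustrative.
struct RectOccluder
    xmin::Float64
    xmax::Float64
    ymin::Float64
    ymax::Float64
end

# Closest point of the rectangle to (x, y); equals (x, y) when inside.
closest_point(o::RectOccluder, x, y) =
    (clamp(x, o.xmin, o.xmax), clamp(y, o.ymin, o.ymax))

# Occlusion probability that decays with squared distance to the occluder.
# The result is kept inside [eps, 1 - eps] so no outcome has probability zero.
function soft_prob_occluded(o::RectOccluder, x, y; scale=0.5, eps=1e-3)
    cx, cy = closest_point(o, x, y)
    d2 = (x - cx)^2 + (y - cy)^2
    return clamp(exp(-d2 / (2 * scale^2)), eps, 1 - eps)
end

occluder = RectOccluder(0.0, 1.0, 0.0, 1.0)
p_near = soft_prob_occluded(occluder, 1.2, 0.5)  # just outside the occluder
p_far = soft_prob_occluded(occluder, 3.0, 0.5)   # far from the occluder
```

A function like this could stand in for calculate_prob_occluded in the helper generative function, so the behaviour is expressed through the model rather than through a custom score.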

Thank you so much for your help!