tanhevg / GpABC.jl

MIT License
Negative distance predicted by emulator #74

Open EvaJanouskova opened 2 years ago

EvaJanouskova commented 2 years ago

From my understanding of ABC, it should yield posterior samples whose distance to the reference data is as close to zero as possible. However, when using SimulatedABCRejection(), I got many accepted samples whose distance is far from zero whenever that distance is negative. I am afraid there is an absolute value missing: for a threshold of 5.0, only simulations with abs(distance) <= 5.0 should be accepted, but currently all simulations with distance <= 5.0 are accepted, which includes every negative distance regardless of how far it is from zero.
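To make the difference between the two acceptance rules concrete, here is a minimal standalone sketch. The distance values are made up for illustration and nothing in it calls GpABC.

```julia
# Hypothetical distances, made up for illustration only.
distances = [0.5, 4.9, 7.2, -0.3, -12.0, -250.0]
threshold = 5.0

# Rule as described above: accept everything at or below the threshold.
accepted_signed = findall(d -> d <= threshold, distances)

# Proposed rule: accept only values whose magnitude is within the threshold.
accepted_abs = findall(d -> abs(d) <= threshold, distances)

println("signed rule accepts indices:   ", accepted_signed)
println("absolute rule accepts indices: ", accepted_abs)
```

With these values the signed rule also accepts -12.0 and -250.0, while the absolute-value rule rejects them.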

tanhevg commented 2 years ago

Which distance function are you using? By definition, a distance function must return a non-negative value for any pair of arguments. That is what the default (Euclidean) distance does. If you are using a custom distance function, please make sure that it really defines a distance.

EvaJanouskova commented 2 years ago

I used the default distance function:

    emul_abcsmc_res = @time EmulatedABCSMC(
        reference_data,
        simulator_function,
        priors,
        threshold_schedule,
        n_particles,
        n_design_points;
        write_progress = false,
        emulator_retraining = NoopRetraining(),
        emulated_particle_selection = PosteriorSampledEmulatedParticleSelection()
    );

tanhevg commented 2 years ago

OK, this makes sense now: in emulation mode the distances are predicted by the GP rather than computed by the distance function. Predictions can indeed turn out negative, for example when the mean is very close to zero and the variance is high. This could be exacerbated by using the full covariance matrix (use_diagonal_covariance = false). The root causes could be problem-specific; I am afraid I do not have time to debug this now.
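As a rough illustration of why a mean close to zero combined with a high variance produces negative values, here is a minimal sketch that draws from two normal distributions. The means and standard deviations are hypothetical and do not come from a fitted GP; only Julia's standard library is used.

```julia
using Random

# Fraction of n draws from Normal(mu, sigma^2) that fall below zero.
function negative_fraction(rng, mu, sigma, n)
    draws = mu .+ sigma .* randn(rng, n)
    return count(<(0), draws) / n
end

rng = MersenneTwister(1)  # fixed seed so that runs are repeatable
n = 100_000

# Hypothetical predictive distributions for a distance, both with mean 0.1.
confident = negative_fraction(rng, 0.1, 0.01, n)  # mean is 10 standard deviations above zero
uncertain = negative_fraction(rng, 0.1, 1.0, n)   # mean is 0.1 standard deviations above zero

println("fraction negative, low variance:  ", confident)
println("fraction negative, high variance: ", uncertain)
```

In the high-variance case a large share of the draws is negative even though the mean itself is positive, which is the situation a selection rule based on sampled predictions would run into.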

Does this work with the default particle selection criterion (MeanEmulatedParticleSelection)? Please feel free to provide your own subtype of AbstractEmulatedParticleSelection and add a method of abc_select_emulated_particles for it that uses the absolute value of the predicted distance.
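The subtype-plus-method pattern could look roughly like the sketch below. It is deliberately standalone: ToySelection, ToyMeanSelection, ToyAbsMeanSelection and toy_select are stand-ins defined here, not GpABC names, and the argument list and return value are assumptions. The real argument list of abc_select_emulated_particles should be checked against the package source before adapting this.

```julia
# Stand-in types defined here for illustration; they are not the GpABC types.
abstract type ToySelection end
struct ToyMeanSelection <: ToySelection end
struct ToyAbsMeanSelection <: ToySelection end

# Each method returns the accepted distances and their indices (an assumed shape).
function toy_select(predicted::AbstractVector{<:Real}, threshold::Real, ::ToyMeanSelection)
    idx = findall(d -> d <= threshold, predicted)
    return predicted[idx], idx
end

# Variant that compares the absolute value of the prediction with the threshold.
function toy_select(predicted::AbstractVector{<:Real}, threshold::Real, ::ToyAbsMeanSelection)
    idx = findall(d -> abs(d) <= threshold, predicted)
    return abs.(predicted[idx]), idx
end

predicted = [0.2, -0.4, -30.0, 6.0]  # hypothetical emulator predictions
mean_result = toy_select(predicted, 5.0, ToyMeanSelection())
abs_result = toy_select(predicted, 5.0, ToyAbsMeanSelection())

println(mean_result)
println(abs_result)
```

Dispatching on the selection type keeps the default behaviour untouched and confines the absolute-value rule to the new subtype.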

EvaJanouskova commented 2 years ago

It does work with the default particle selection criterion (MeanEmulatedParticleSelection). Thank you!

Since I don't necessarily insist on using PosteriorSampledEmulatedParticleSelection() and since I'm afraid I don't have time to debug this right now either, I'm leaving it open :/, sorry.