Scaling of input and output to the inference algorithm

pfeffer90 commented 5 years ago

Hi, thanks for providing this nicely structured inference module.

I would like to use the package to infer parameters from an interspike interval (ISI) distribution. When I do inference on a single parameter of the distribution, I get a nice posterior for a parameter that varies between 0 and 1, but for another parameter that varies over a larger range, 200 to 2000, the posterior is located at smaller values.

Do you have an idea why this might happen? Could this be a scaling issue, for example some internal scaling that keeps values in a certain range?

Below is a short overview of the code snippets I used. Basically, when I mask all but the w parameter, inference seems to work nicely. When I do the same for tau_e, inference goes wrong.

Simulator

# Full ISI distribution with four parameters
number_of_params = 4
duration = 40000
isi_simulator = DistributionBasedISIGenerator(number_of_params, MixedLimitCylceAndFixPointISI, duration)

# Mask all but one parameter, either w or tau_e (here 
tau_lc = 100 # fixed param
sigma_lc = 10 # fixed param
tau_e = 250 # parameter between 200 and 400 
w = 0.5 # parameter between 0 and 1

masked_isi_simulator = MaskedSimulator(sim=isi_simulator, mask=np.array([False, False, False, True]), obs=np.array([w, tau_lc, sigma_lc, tau_e]))

Prior

prior = Uniform(lower=[200], upper=[400])
# or Uniform(lower=[0], upper=[1]) when masking all but w

Summary statistics

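import numpy as np
from scipy import stats  # assumption: `stats` refers to scipy.stats (provides `variation`)
from summary_stats.isi_stats import ISIStats

# Summary statistics: mean ISI and coefficient of variation
our_isi_summary_stats = [np.mean, stats.variation]
s = ISIStats(our_isi_summary_stats, input_is_spike_train=False)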

Generator

from delfi.generator import Default
g = Default(masked_isi_simulator, prior, s)

The generated data looks nice in both cases, that is, the inputs are sampled from the given prior and the outputs lie in the expected range.

params, isi_stats = g.gen(1000)
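
A quick way to make the "looks nice" check explicit is to test the sampled parameters against the prior bounds and print the scale of each array. This is only a sketch: the helper names are hypothetical, and the random array stands in for the `params` returned by `g.gen(1000)`, which is not reproduced here.

```python
import numpy as np


def check_within_bounds(params, lower, upper):
    """Return True if every sampled parameter lies inside [lower, upper]."""
    params = np.asarray(params, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return bool(np.all((params >= lower) & (params <= upper)))


def summarize(values):
    """Per-column min, max, mean and std, to see which scale the data lives on."""
    values = np.asarray(values, dtype=float)
    return {
        "min": values.min(axis=0),
        "max": values.max(axis=0),
        "mean": values.mean(axis=0),
        "std": values.std(axis=0),
    }


# Stand-in for `params` from g.gen(1000): tau_e drawn from Uniform(200, 400).
rng = np.random.default_rng(0)
params = rng.uniform(200.0, 400.0, size=(1000, 1))

print(check_within_bounds(params, lower=[200], upper=[400]))
print(summarize(params))
```

Comparing the summary for tau_e with the one for w makes the difference in scale between the two setups visible before any network is trained.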

Basic inference

from delfi.inference import Basic
inf = Basic(generator=g, n_components=1, n_hiddens=[10])

I then train the network with about 1000 samples and evaluate it on a single data point. When masking all but w, I use w = 0.5 and get the expected Gaussian posterior centered close to 0.5. When masking all but tau_e, I feed in tau_e = 300 and get a Gaussian that is far off, centered at 0.

dgreenberg commented 5 years ago

Could you provide code for your simulator, or some other minimal working example that reproduces the issue? We need to reproduce this issue in order to solve it.

dgreenberg commented 5 years ago

Closing due to lack of activity.

Please feel free to re-open with a minimal working example that can reproduce this bug.

pfeffer90 commented 5 years ago

Hi @dgreenberg, sorry for the delay, I cleaned up our repro. I attached it as infer_sn_vs_hom-master.zip (unfortunately our institute GitLab prevents making projects public). In the notebook basic_inference_mixture_ISI_on_tau_e, I reproduced the issue I describe above. For comparison, see the notebook basic_inference_mixture_ISI_on_the_mixing_factor, where the inference seems to work nicely. Thanks for having a look.

dgreenberg commented 5 years ago

OK, I'm reopening this and we will have a look as soon as possible.

jan-matthis commented 5 years ago

My guess is that the reason is that tau_e is on a different scale. I would try z-scoring the parameters before using them to train the inference network (and back-transform afterwards).
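
For illustration, z-scoring and back-transforming could look roughly like the following. This is a minimal NumPy-only sketch, not delfi code: all function names are hypothetical, the random array stands in for the tau_e samples drawn from the prior, and the back-transform assumes the posterior is a single Gaussian (as with n_components = 1).

```python
import numpy as np


def fit_zscore(params):
    """Estimate per-parameter mean and std from training samples (rows)."""
    params = np.asarray(params, dtype=float)
    mean = params.mean(axis=0)
    std = params.std(axis=0)
    # Guard against constant columns to avoid division by zero.
    std = np.where(std > 0, std, 1.0)
    return mean, std


def to_z(params, mean, std):
    """Map parameters to zero mean and unit variance."""
    return (np.asarray(params, dtype=float) - mean) / std


def from_z(z, mean, std):
    """Map z-scored parameters back to original units."""
    return np.asarray(z, dtype=float) * std + mean


def gaussian_from_z(mu_z, sigma_z, mean, std):
    """Back-transform mean and std of a Gaussian posterior fitted in z-space."""
    return mu_z * std + mean, sigma_z * std


# Stand-in for tau_e samples from Uniform(200, 400).
rng = np.random.default_rng(0)
tau_e = rng.uniform(200.0, 400.0, size=(1000, 1))

mean, std = fit_zscore(tau_e)
z = to_z(tau_e, mean, std)  # train the inference network on z instead of tau_e

# A posterior N(mu_z, sigma_z^2) in z-space corresponds to
# N(mu_z * std + mean, (sigma_z * std)^2) in original units.
mu, sigma = gaussian_from_z(0.0, 1.0, mean, std)
```

Since the transform is affine, a Gaussian posterior stays Gaussian after back-transforming; only its mean and standard deviation are rescaled.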

jan-matthis commented 5 years ago

Closing due to inactivity. Feel free to reopen if and when you continue working on this.

dgreenberg commented 5 years ago

I actually think this is pretty much resolved due to fixes for prior_norm and the new transformed distribution.

mackelab / delfi