joshuachristie / timeseries-inference

Inferring parameters of evolutionary models from allele frequency data

train models on "incorrect" causal model structure and "error prone" data #7

Closed joshuachristie closed 3 years ago

joshuachristie commented 3 years ago

I want to show that one does not need to perfectly specify the causal model in order to make (relatively) accurate inferences. Likewise, I want to show that inferences remain (relatively) accurate when there is some sort of systematic (or random) error in data collection.

(The idea here is to show that one could apply my approach to a real-world biological study in which you are estimating the causal model, rather than knowing its structure, and in which your measurements will not be perfect.)
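As a rough illustration of what "error prone" data could look like, here is a minimal sketch that simulates an allele frequency trajectory and then degrades it with random error (finite sampling) and systematic error (a constant bias). Everything in it is an assumption made for illustration: the haploid Wright-Fisher model with selection, the function names `wright_fisher` and `observe`, and all parameter values. It is not the code used in this repository.

```python
import random


def wright_fisher(p0, s, pop_size, generations, rng):
    """Simulate allele frequencies under a haploid Wright-Fisher model.

    Assumed model: selection coefficient s acts first, then binomial drift.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Expected frequency after selection.
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Binomial sampling of the next generation (genetic drift).
        count = sum(rng.random() < p_sel for _ in range(pop_size))
        p = count / pop_size
        freqs.append(p)
    return freqs


def observe(freqs, sample_size, bias, rng):
    """Return error-prone observations of the true frequencies.

    Random error: only sample_size individuals are sampled per generation.
    Systematic error: a constant (hypothetical) bias is added to each estimate.
    """
    observed = []
    for p in freqs:
        count = sum(rng.random() < p for _ in range(sample_size))
        noisy = count / sample_size + bias
        # Clamp to the valid frequency range.
        observed.append(min(1.0, max(0.0, noisy)))
    return observed


rng = random.Random(0)
true_freqs = wright_fisher(p0=0.1, s=0.05, pop_size=200, generations=50, rng=rng)
obs_freqs = observe(true_freqs, sample_size=30, bias=0.02, rng=rng)
```

Training on `obs_freqs` rather than `true_freqs`, and then evaluating against the true parameters, would be one way to measure how much accuracy is lost to measurement error.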

joshuachristie commented 3 years ago

Another possible variant here is to do a form of model selection: e.g., say that I don't know which of two possible models generated the training data. I could train competing models (with the difference being the outputs of the hypothesised model structure in the cost function). You'd then "choose" the model with the lowest error with respect to fixation probability.

However, I'm not sure this even makes sense. How would I deal with the parameters and fixation probabilities for the test data (imagining that I was actually trying to do this in the field)? In theory, I could estimate the parameters of a hypothesised model without much issue, but what about fixation probabilities? I can't directly measure these, and since one needs them in order to evaluate the model (relative to a competing model), I don't think that model selection could work using this approach. I'd need to be doing something like comparing the relative likelihood of the data given each model, but I can't see how this would be useful if applied to a single stochastic time series (unless the competing causal models generate time series data with drastically different statistical properties).
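For concreteness, comparing the relative likelihood of a single time series under two candidate models might look like the sketch below. It assumes a haploid Wright-Fisher model with a known population size and exactly observed allele counts, and it compares a hypothetical "selection" model (`s=0.1`) against a neutral one (`s=0.0`). The function names and the counts are made up for illustration. It only shows the computation and does not address the concern that one short trajectory may carry too little information to separate the models.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    # Handle the absorbing boundaries explicitly to avoid log(0).
    if p <= 0.0:
        return 0.0 if k == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if k == n else -math.inf
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log(1 - p)


def log_likelihood(counts, pop_size, s):
    """Log-likelihood of an allele count trajectory given selection coefficient s.

    Assumed model: each generation is a binomial draw around the
    post-selection frequency of the previous generation.
    """
    total = 0.0
    for prev, nxt in zip(counts, counts[1:]):
        p = prev / pop_size
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        total += log_binom_pmf(nxt, pop_size, p_sel)
    return total


# Made-up allele counts for one population of size 100.
counts = [10, 12, 15, 14, 19, 23, 27, 30, 36, 41]
pop_size = 100

# Positive values favour the selection model, negative values the neutral model.
log_ratio = log_likelihood(counts, pop_size, s=0.1) - log_likelihood(counts, pop_size, s=0.0)
```

Repeating this over many simulated trajectories from each model would show how often the log ratio points to the correct model, which is one way to check whether the two models are distinguishable at all from a single series.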

joshuachristie commented 3 years ago

While I think this gets at an interesting idea, I no longer have time to implement it.