AdityaSavara / PEUQSE

Parameter estimation for complex physical problems often suffers from finding ‘solutions’ that are not physically realistic. The PEUQSE software provides tools for finding physically realistic parameter estimates, graphs of the parameter estimate positions within parameter space, and plots of the final simulation results.

For Kinetics: Use list/arrays/vectors to extend to arbitrarily complex responses #3

Open AdityaSavara opened 4 years ago

AdityaSavara commented 4 years ago

Currently, tprequation inside https://github.com/AdityaSavara/ODE-KIN-BAYES-SG-EW/blob/master/tprmodel.py looks like this:

```python
def tprequation(tpr_theta, t, Ea1_mean, Ea2_mean, log_A1_mean, log_A2_mean,
                gamma_1_mean, gamma_2_mean, dT, dt, start_T):
```

This function either needs to be generated programmatically (by a function writing it out to a file), or it needs to accept some kind of lists / list expansion.

Can it become like this?

```python
def tprequation(tpr_theta, t, tpr_Ea_mean_list, log_A1_mean_list,
                gamma_1_mean_list, gamma_2_mean_list, dT, dt, start_T):
```

Maybe we also need to add a variable for the number of species (a constant, like start_T is a constant).
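As a rough illustration, a list-based version might look like the sketch below. It assumes a first-order rate law with a coverage-dependent activation energy and a linear temperature ramp, and it collapses the gamma lists into a single gamma_mean_list for brevity; the actual rate expressions should come from tprmodel.py.

```python
import numpy as np

# Hypothetical list-based version: each *_list holds one entry per surface
# species, so the same function handles any number of species.
def tprequation(tpr_theta, t, Ea_mean_list, log_A_mean_list, gamma_mean_list,
                dT, dt, start_T):
    R = 8.314  # gas constant, J/(mol K)
    T = start_T + (dT / dt) * np.asarray(t)  # linear temperature ramp
    tpr_theta = np.asarray(tpr_theta)
    # 1-D during odeint (one coverage per species); 2-D (times x species)
    # when called afterwards on the full solution, hence the transpose.
    thetas = tpr_theta if tpr_theta.ndim == 1 else tpr_theta.T
    rates = []
    for i, theta_i in enumerate(thetas):
        # Placeholder coverage-dependent activation energy; the real
        # expression should be taken from tprmodel.py.
        Ea_i = Ea_mean_list[i] - gamma_mean_list[i] * theta_i
        # Pre-exponential factor assumed to be stored as a base-10 log.
        k_i = 10 ** log_A_mean_list[i] * np.exp(-Ea_i / (R * T))
        rates.append(-k_i * theta_i)
    return rates
```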

We also need to figure out a way to generalize the equations below:

```python
tpr_theta = odeint(tprequation, [0.5, 0.5], times,
                   args=(post_mean[0], post_mean[1], post_mean[2], post_mean[3],
                         post_mean[4], post_mean[5], dT, dt, start_T))  # [0.5, 0.5] are the initial thetas.
```

```python
rate = tprequation(tpr_theta, times, post_mean[0], post_mean[1], post_mean[2],
                   post_mean[3], post_mean[4], post_mean[5], dT, dt, start_T)
rate_tot = -np.sum(rate, axis=0)
```

```python
tpr_theta = odeint(tprequation, [0.5, 0.5], self.times,
                   args=(sample[0], sample[1], sample[2], sample[3],
                         sample[4], sample[5], self.dT, self.dt, self.start_T))  # [0.5, 0.5] are the initial thetas.
rate = tprequation(tpr_theta, self.times, sample[0], sample[1], sample[2],
                   sample[3], sample[4], sample[5], self.dT, self.dt, self.start_T)
rate_tot = -np.sum(rate, axis=0)
```
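Under that assumed list-based signature, the hard-coded two-species calls above could generalize along these lines (a sketch; Ea_mean_list, log_A_mean_list, and gamma_mean_list stand for per-species parameter lists unpacked from post_mean or sample):

```python
import numpy as np
from scipy.integrate import odeint

# One initial coverage per species replaces the hard-coded [0.5, 0.5].
initial_thetas = [0.5] * len(Ea_mean_list)
tpr_theta = odeint(tprequation, initial_thetas, times,
                   args=(Ea_mean_list, log_A_mean_list, gamma_mean_list,
                         dT, dt, start_T))
rate = tprequation(tpr_theta, times, Ea_mean_list, log_A_mean_list,
                   gamma_mean_list, dT, dt, start_T)
rate_tot = -np.sum(rate, axis=0)  # total rate summed over species
```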

AdityaSavara commented 4 years ago

Implementing arbitrary complexity is going to require us to make a reactants and products array. On Nov 25th I wrote to Eric:

Eric, I have another question. For this line:

```python
probability_metric = multivariate_normal.pdf(x=rate_tot, mean=self.experiment, cov=self.errors)
```

If I understand correctly, if we want to generalize to having several response variables (like concentration 1, concentration 2, etc.), we can just generalize it as follows:

```python
probability_metric_1 = multivariate_normal.pdf(x=rate_tot_1, mean=self.experiment_1, cov=self.errors_1)
probability_metric_2 = multivariate_normal.pdf(x=rate_tot_2, mean=self.experiment_2, cov=self.errors_2)
probability_metric = probability_metric_1 * probability_metric_2
```

If so, we can make a probability_metric_array and then….

```python
probability_metric = np.prod(probability_metric_array)
```
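As a sketch of that array form, assuming the per-response model outputs, experiments, and error covariances are collected into parallel lists (hypothetical names):

```python
import numpy as np
from scipy.stats import multivariate_normal

# One pdf evaluation per response variable; the responses are treated as
# independent of each other, so the joint probability is the product.
probability_metric_array = np.array([
    multivariate_normal.pdf(x=rate_tot_i, mean=experiment_i, cov=errors_i)
    for rate_tot_i, experiment_i, errors_i
    in zip(rate_tot_list, experiment_list, errors_list)
])
probability_metric = np.prod(probability_metric_array)
```

One caveat: a product of many small pdf values can underflow to zero in floating point (the problem described in the next comment); summing multivariate_normal.logpdf values instead is a standard way to avoid that.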

Do you think I understand correctly? Or is it going to be harder than this to generalize to multiple response variables?

Ashi

AdityaSavara commented 4 years ago

We've already extended the code to handle arbitrary complexity. However, I found that we started getting numerical errors in the posterior, where it was returning zero due to multiplying many small probabilities together. I was able to mitigate this problem by switching to a scaled covariance matrix, but it's likely that this problem will happen again. Based on the source of the problem and playing with some examples, the problem can be mitigated by looking at reduced parameter spaces. Accordingly, a long-term task would be as follows.

1) Check whether the multivariate pdf comes back as zero; if so, then...
2) Make an as-close-to-diagonalized representation of the covariance matrix as possible, so that for each parameter we only consider the covariances with two other parameters (see attached).
3) Consider individual 3x3 matrices for the smaller covariances, apply the multivariate pdf to each of those, then multiply together what is returned from each of those individual smaller covariance multivariate pdfs.

Note that the above procedure will actually return a smaller float than the true value, but that is fine because (a) it still gives a finite value back, which is better than zero, and (b) it will not apply to cases that don't have the above problem. So it heavily penalizes bad solutions without causing numerical errors. Admittedly it over-penalizes, but this is only intended as a compromise to avoid numerical errors. During implementation, we could implement steps 1 and 3 without implementing step 2 until later (steps 1 and 3 alone would already be an improvement).

(attachment: AlmostDiagonalCovMat)
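A minimal sketch of steps 1 and 3, assuming x, mean, and cov are NumPy arrays, using consecutive 3x3 diagonal blocks and skipping the step-2 reordering (the function name is hypothetical):

```python
import numpy as np
from scipy.stats import multivariate_normal

def fallback_probability(x, mean, cov, block_size=3):
    """Return the full multivariate pdf when it is nonzero; otherwise fall
    back to the product of pdfs of consecutive diagonal blocks (step 3).
    Cross-block covariances are dropped, which per the note above typically
    yields a smaller value than the true pdf, but a finite one."""
    full_pdf = multivariate_normal.pdf(x=x, mean=mean, cov=cov)
    if full_pdf > 0:  # step 1: only fall back when the full pdf underflowed
        return full_pdf
    pdf = 1.0
    for start in range(0, len(mean), block_size):
        block = slice(start, start + block_size)
        pdf *= multivariate_normal.pdf(x=x[block], mean=mean[block],
                                       cov=cov[block, block])
    return pdf
```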