choderalab / bayesian-itc

Python tools for the analysis and modeling of isothermal titration calorimetry (ITC) experiments.
GNU General Public License v3.0

What determines initial log_sigma guess? #1

Closed bas-rustenburg closed 9 years ago

bas-rustenburg commented 9 years ago

Right now, it seems to be the log of the standard deviation of the last 4 injection heats. The assumption would be that the last four injections are all the same, since they're just dilution/mechanical heats (if I interpret this correctly). Is there any way to select that number, 4, in a more intelligent fashion? Or even to estimate the noise differently, e.g. as the deviation from a baseline model?

https://github.com/choderalab/bayesian-itc/blob/master/python/models.py#L188

Since it's just an initial guess for the noise, it might not be worth spending time on optimizing this. I'm just curious about possible approaches.
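For reference, here is a minimal sketch of the heuristic as I understand it. It is not the code from `models.py`: the function name `initial_log_sigma` and the `n_tail` parameter are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np


def initial_log_sigma(injection_heats, n_tail=4):
    """Hypothetical helper: log of the std of the last `n_tail` injection heats.

    Assumes the final injections contain only dilution/mechanical heats,
    so their scatter approximates the measurement noise.
    """
    heats = np.asarray(injection_heats, dtype=float)
    if n_tail < 2 or heats.size < n_tail:
        raise ValueError("need at least 2 tail injections, and enough heats")
    tail = heats[-n_tail:]
    # ddof=1 gives the sample standard deviation (an assumption here).
    return np.log(tail.std(ddof=1))


# Made-up heats (ucal), for illustration only.
example_heats = [-10.0, -8.0, -5.0, -1.0, -1.2, -0.8, -1.0]
guess = initial_log_sigma(example_heats)
```

Making `n_tail` a parameter at least exposes the "4" so it can be varied. Note that if the tail heats happen to be identical, the standard deviation is zero and the log diverges.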

jchodera commented 9 years ago

A much better way to do this is to allow the user to specify some "calibration" files that contain just water-water injections.

Failing that, maybe we just set the initial sigma based on the magnitude of the smallest injection heat?

bas-rustenburg commented 9 years ago

So that assumes the noise for a water-water titration is the same as for a complex mixture such as a protein in solution, with different buffers, heat capacity, et cetera. Still, it's probably a better guess.

The magnitude of the smallest injection... how would you translate that to a noise/variance measure?

jchodera commented 9 years ago

We should definitely explore other error models. Joel Tellinghuisen explores several of these in this paper. See Eq. 10 for the list of possible models.

We could potentially use Bayesian model selection schemes to decide upon the appropriate model.

jchodera commented 9 years ago

> The magnitude of the smallest injection... how would you translate that to a noise/variance measure?

We'd presume that the magnitude of the smallest injection is probably within 1-2 orders of magnitude of the actual error size, so it's a decent starting guess in the absence of other information.
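A rough sketch of that fallback, assuming the heats are available as plain floats in a common unit. The function name `sigma_guess_from_smallest_heat` is hypothetical, and skipping exactly-zero heats is an added assumption so the guess stays positive.

```python
import numpy as np


def sigma_guess_from_smallest_heat(injection_heats):
    """Hypothetical helper: use the smallest nonzero |heat| as the initial sigma."""
    magnitudes = np.abs(np.asarray(injection_heats, dtype=float))
    nonzero = magnitudes[magnitudes > 0.0]
    if nonzero.size == 0:
        raise ValueError("all injection heats are zero; no scale to use")
    # Only an order-of-magnitude starting point, not a noise estimate.
    return float(nonzero.min())


# Made-up heats (ucal), for illustration only.
sigma0 = sigma_guess_from_smallest_heat([-10.0, -4.0, 0.5, -0.8])
```

Since this is only expected to land within an order of magnitude or two of the true error, it would make sense to pair it with a broad prior on sigma.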

jchodera commented 9 years ago

The noise probably also depends on the number of time-filtered samples integrated for each injection heat. For example, if we had uncorrelated Gaussian error with standard deviation $\sigma_1$ in each 2-second sample, integrating $N$ of these samples by summing would give an error with standard deviation $\sigma_{\mathrm{int}} = \sqrt{N} \, \sigma_1$.

This can be important when the injection durations are different.
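A quick numerical check of the square-root scaling, as a sketch. The helper `integrated_sigma` is a hypothetical name, and the per-sample noise level and sample count below are made-up values, not taken from any instrument.

```python
import numpy as np


def integrated_sigma(sigma_1, n_samples):
    """Std of a sum of n_samples uncorrelated Gaussian errors, each with std sigma_1."""
    return np.sqrt(n_samples) * sigma_1


# Simulate many injections, each the sum of n uncorrelated noisy samples.
rng = np.random.default_rng(0)
sigma_1, n = 0.01, 150  # made-up per-sample std and samples per injection
sums = rng.normal(0.0, sigma_1, size=(20000, n)).sum(axis=1)
empirical = sums.std(ddof=1)
predicted = integrated_sigma(sigma_1, n)
```

Under these assumptions, an injection integrated over 4 times as many samples would carry twice the noise, so a single shared sigma would not describe injections of different durations.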

bas-rustenburg commented 9 years ago

Looking back at this code, are we taking the logarithm of something that has a unit (microcals)? Should we be normalizing this/canceling out the unit?

jchodera commented 9 years ago

Yeah, we should first divide by some reference unit, such as 1 ucal.
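Something like the following, as a sketch that assumes sigma is held as a plain float in microcalories. The function name `log_sigma_dimensionless` and the `reference_ucal` parameter are hypothetical; if the heats carry units from a units library, the division would go through that library instead.

```python
import math


def log_sigma_dimensionless(sigma_ucal, reference_ucal=1.0):
    """Hypothetical helper: log of sigma after dividing out a reference unit."""
    if sigma_ucal <= 0.0 or reference_ucal <= 0.0:
        raise ValueError("sigma and the reference unit must be positive")
    # The ratio is dimensionless, so taking its log is well defined.
    return math.log(sigma_ucal / reference_ucal)


log_sigma = log_sigma_dimensionless(2.0)  # 2 ucal relative to 1 ucal
```

Choosing a different reference unit only shifts `log_sigma` by a constant, so the choice mainly needs to be stated and used consistently.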

bas-rustenburg commented 9 years ago

Closing in favor of #27.