Open · cmoecke opened this issue 3 years ago
Hello,
Thanks for trying my code! I'd be happy to try and see if I can help get this to work, but I'm currently fairly busy unfortunately. I hope I get around to it sometime next week, so I thought it would be best to drop you a quick message that I've seen your question :)
Best, Sjoerd
Hi,
Thank you very much! That’s really nice of you!! :) Looking forward to your answer, Conrad
Hi,
I just had a quick preliminary look and some questions came to my mind:
"it should be (normally) distributed within the closed interval [0,1]" What does this mean? A normal distribution is, by definition, unbounded. What kind of quantity does f
represent? Is it always bounded between 0 and 1? Or between -1 and 1? In either case, it wouldn't make sense to me to model it with Gaussian noise.
This is a regression-type problem, correct? Did you look at the logistic regression example in the notebook from the repo? I feel like the problem is fairly similar to that.
What is the log likelihood function you want to use? If you can describe that clearly to me, it shouldn't be difficult to punch the problem into the code.
Hey, thanks for the reply!
1.) By that, I meant that the probability distribution of `f` for each lag time `dt` should be constrained within (0,1]. `f` represents an autocorrelation function, obtained from a light scattering experiment (e.g. capturing a bright-field video of beads moving in water). In the end I use it to calculate a mean squared displacement (MSD), which has the following proportionality: `MSD(dt) ~ -Log[f(dt)]`. That's why I strictly require the probability distribution to be >0. The constraint of it being <=1 simply says that the MSD should not be negative, as the un-squared displacement is strictly real-valued. This is always the case. For the other parameters, I only require them to be >0. So... perhaps it would be better to assume that they are log-normally distributed?
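Just to make the constraint concrete, here is a tiny illustration (not code from my notebook):

```mathematica
(* Illustration only: with MSD(dt) ~ -Log[f(dt)], any f in (0, 1] gives a
   non-negative MSD, f -> 0+ sends the MSD to infinity, and f > 1 would
   make the MSD negative, which is unphysical. *)
msd[f_] := -Log[f]  (* proportionality constant omitted *)

{msd[1.], msd[0.5], msd[0.001]}
(* {0., 0.693147, 6.90776} *)
```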
2.) Unfortunately, I think it is not. It can be, though, if I know the underlying mechanics of the measurement. That means, e.g., for the most simple case, which would be Brownian motion, the ACF decays like an exponential `Exp[-D*dt]`, where `D` is the diffusion coefficient. However, for more involved measurements, where I am not completely sure what's going on, I don't want to limit the evaluation at this stage by sticking to some (empirical) model (at least for now). In this case, it is a better choice for me to look at the `MSD` mentioned in 1.) and compare these across different measurements, model-independently.

3.) There is, to my knowledge, no theoretical background on what the probability distributions of these parameters should be exactly, so I am not really sure. Maybe I'd stick to a log-likelihood function similar to the one stated in my initial post, `LogLikelihood[LogNormalDistribution[ai*(1-fi)+bi, dd[[i]]]]`, and get the posterior distribution of `fi` for each lag time step, as sketched below. I am sorry that this is not really precise.
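Something roughly like this is what I have in mind; all names are placeholders rather than the actual variables from my notebook:

```mathematica
(* Placeholder sketch of the per-lag-time log-likelihood: a and b are the
   noise/amplitude estimates, fi the ACF value at this lag time, dVal the
   measured d(dt) and dd its propagated uncertainty. *)
singleLogLik[fi_?NumericQ, a_, b_, dVal_, dd_] :=
  LogLikelihood[LogNormalDistribution[a (1 - fi) + b, dd], {dVal}]

(* Example call with made-up numbers: *)
singleLogLik[0.8, 1.2, 0.05, 1.1, 0.1]
```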
Thanks a lot for looking into this!!
If `f` is definitely constrained within [0, 1], I think you should look for a natural way to phrase a model for `f` that satisfies this constraint. A log-normal distribution (i.e., the exponential of a normal distribution) would make sense for a strictly positive quantity, but if `f` is constrained within [0, 1], it would make more sense to model it as, e.g., the `LogisticSigmoid` of some other quantity. So something along the lines of

`TransformedDistribution[LogisticSigmoid[...], ...]`
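For instance (arbitrary placeholder parameters, purely to show the shape of the idea):

```mathematica
(* Push a normally distributed latent variable through LogisticSigmoid so
   that the resulting quantity automatically lives in (0, 1). *)
latentModel = TransformedDistribution[
   LogisticSigmoid[x], x \[Distributed] NormalDistribution[0.5, 1.]];

(* Quick sanity check: all samples stay strictly between 0 and 1. *)
MinMax @ RandomVariate[latentModel, 10^4]
```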
Or perhaps as a `BetaDistribution` or something like that. In the end, though, I can't tell you how to model your data, of course. All I can say for sure is that the model should fit naturally with the constraints of the problem.
One thing you definitely do want to do is to write a single log-likelihood function for the whole dataset instead of a table of 35 independent ones. If the observations are independent, you should be able to just sum the log-likelihoods together. The function `defineInferenceProblem` can take log-likelihood functions directly, and in this situation I think that would be the most useful course of action.
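Schematically, something like this; the variable names are placeholders and the lognormal form just mirrors your earlier snippet, so adapt it to whatever model you end up with:

```mathematica
(* One log-likelihood for the whole dataset: sum the independent
   per-lag-time contributions instead of setting up 35 separate problems.
   fs is the list of ACF values (the parameters of interest), a and b the
   noise/amplitude parameters, dVals the measured d(dt) values and ddVals
   their propagated uncertainties. *)
totalLogLikelihood[fs_List, a_, b_, dVals_List, ddVals_List] :=
  Total @ MapThread[
    LogLikelihood[LogNormalDistribution[a (1 - #1) + b, #3], {#2}] &,
    {fs, dVals, ddVals}
  ]
```

With the data slotted in, a function like this is what I'd hand to `defineInferenceProblem` as the log-likelihood; the example notebook shows the exact argument format.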
Okay, so I will try that and close the issue once I succeed. That's helping me a lot, thank you :)
Dear Sjoerd,

as stated in the title, I want to perform uncertainty quantification for an ACF that I calculate dependent on some lag time `dt`. The ACF `f(dt)` is implicitly given by

[equation not reproduced here; see the attached notebook]

where `A` and `B` are educated estimations of time-independent parameters (i.e. noise and amplitude estimates of a continuous signal) and `d(dt)` is some quantity that is calculated for said discrete lag times `dt`; these are the input parameters. Let's say for now that these parameters are independent of each other and normally distributed. Further, I have some prior knowledge about `f(dt)`, namely that it should be (normally) distributed within the closed interval `[0,1]` (same goes for `A` and `B`).

As a first step, in order to estimate the associated uncertainty of the ACF, I did Gaussian propagation of uncertainty, obtaining `df(dt)`. Now it might happen that this prior knowledge is violated within the frequentist uncertainty margins, and I am having trouble justifying putting a hard constraint on it. That's why I thought this would be a great use case for Bayesian statistics. However, I am not quite sure how to implement it properly within this excellent paclet. My initial idea was to perform the nested sampling for each `dt` step as follows (please also see the attached notebook, where I provide example data and the full code):

[code not reproduced here; see the attached notebook]

where I input the values of `f(dt)` and `df(dt)` from the previous calculations and the Gaussian uncertainty propagation, respectively. Here I am not sure whether it would be possible to do the analysis with the full list of `d` instead of employing `Table`; that way the calculations could be sped up a bit, since for a full data analysis I would need to run this procedure ~50 times. I then get the new median and .68 quantiles by

[code not reproduced here; see the attached notebook]

The values I get in the end seem plausible to me, though I am unsure whether this approach is actually suited for this problem. Intuitively it makes sense to me, but maybe I am missing something crucial.
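(For clarity, the summary step looks schematically like the following; this is not the exact cell from the notebook, and `posteriorSamples` just stands in for the posterior draws I obtain for one lag time.)

```mathematica
(* Schematic stand-in: median and central 68% interval from a list of
   posterior samples of f at one lag time. The samples here are dummies. *)
posteriorSamples = RandomVariate[BetaDistribution[8, 2], 10^4];
Quantile[posteriorSamples, {0.16, 0.5, 0.84}]
(* -> {lower 68% bound, median, upper 68% bound} *)
```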
Thanks in advance!
UCP_Bayesian.nb.zip