Open vivarose opened 8 years ago
Thanks for you request! I think that a simple error analysis can be based on the effective number of independent measurements, which is already calculated by msd
. However, if you start to include tracking uncertainties, particle correlations etc., the error estimation becomes more difficult and very much user-specific.
I am not in favor of including those extra effects. A simple estimation would be fine, if we are clear about the simplified error model we use.
I do not have the time to look into this, but I can help you if you want to contribute to trackpy yourself!
I will definitely think about contributing to trackpy because I do think this would be helpful for me and for the community.
:smile: This would be a welcome addition! I agree that it should not be turned on by default, and it should get more than a few words in the documentation. Otherwise the possibility for a non-expert (i.e., me) to misinterpret the result could be too great.
My plan is to edit the emsd function so that it returns a standard deviation. This standard deviation shows how the single particle MSDs vary. Very simple, no subtleties. Two initial questions (which can be changed later anyway):
I am strongly :-1: on kwargs changing the number/type of returned values. If you need to break the API on msd
, then either just break it and document it or (and I know @danielballan will disagree about this) make a new function (msd_err
?, I am bad at naming things).
I would go with the numpy naming of std
.
Welcome, @vivarose!
@tacaswell A reminder here that we already have this detail=True
flag that @vivarose mentioned in (1). If I was doing emd
over again, I would implement emd_detailed
as a separate function. But I think (1) is probably the right way to go here with respect to consistency with the existing code. If we want to rethink that API, we should do it for all the detailed columns, in a separate pull request.
I agree: I like std
for consistency with numpy -- notwithstanding its unfortunate name collision with common English usage.
Hi,
Thank you for the encouragement! I hope this will be useful for the community. I find that MSD plots look deceptively smooth because the datapoints are correlated with each other, so having the ability to plot errorbars will certainly be useful to me.
I added a few lines to the function tp.motion.emsd() so that it now calculates and returns the standard deviation. It's available here: https://github.com/vivarose/trackpy
Please let me know if I should change anything or edit any of the documentation.
Here are the lines I added to the code:
# Calculation of biased weighted standard deviation
numerator = ((msds.subtract(results))**2).mul(msds['N'], axis=0).sum(level=1)
denominator = msds['N'].sum(level=1) # without Bessel's correction
variance = numerator.div(denominator, axis=0)
variance = variance[['<x>', '<y>', '<x^2>','<y^2>','msd']]
std = np.sqrt(variance)
std.columns = 'std_' + std.columns
return results.join(std)
And I am changing the function's comment to indicate the new columns:
Returns
-------
Series[msd, index=t] or, if detail=True,
DataFrame([<x>, <y>, <x^2>, <y^2>, msd, N, lagt,
std_<x>, std_<y>, std_<x^2>, std_<y^2>,
std_msd],
index=frame)
Best, Viva
Closing through #352
@vivarose I am reopening this issue, because we did not merge your PR. Let's keep this issue open until there is a new PR created.
My PI Vinny has a classroom student working on a Bayesian approach to the error analysis, which may supersede my pull-request.
Any update on whether this will be merged anytime soon?
I don’t believe there is anything to merge yet.
Rather than plotting error bars, I decided to publish plots where I showed all of the individual particle MSDs (faintly) to show the variation, and the ensemble MSD (bolded) to show the weighted average. This visualizes the variation without requiring a calculation of the standard error.
The publication is available: https://link.springer.com/article/10.1140/epjst/e2019-800026-y Free: https://arxiv.org/abs/1806.05760 See figures 5A and 6A.
My code for analysis and plotting is available: https://github.com/vivarose/using_trackpy
In particular: https://github.com/vivarose/using_trackpy/blob/master/2018%20EPJST%20figures/EPJ%20paper%20(Janus%2C%20green)%20-%20Figure%205A.ipynb https://github.com/vivarose/using_trackpy/blob/master/2018%20EPJST%20figures/EPJ%20paper%20(tracer%2C%20red)%20-%20Figure%206A.ipynb
You are welcome to use my code and please cite the publication if you do!
If you are looking to calculate the standard deviation of the MSD, you are welcome to use my code, https://github.com/horowitz-lab/using_trackpy/blob/master/stdev_emsd.py
I would like to see errorbars on the ensemble mean square displacement plot. There are a number of subtleties that could be taken into account. (For example, the coordinates that Trackpy assigns to each particle will have some error. For another example, the single particle MSD error must be calculated with the awareness that the data being averaged are not independent.)
The easiest way to do it for an ensemble MSD is to calculate the standard deviation when taking the average of the single-particle trajectories.
I think that calculating and plotting errorbars should be built in, if not the default setting.