Questions about using the pair_correlation_2d function

stevenvanuytsel commented 3 years ago

Hi, would it be possible to write a short tutorial on how to use the pair_correlation_2d function? In my specific case I'd like to find the radial distribution function between particles in two different colour channels. I've tracked the particles in both channels, concatenated them into one dataframe (renaming the particles so there are no duplicates), and then passed the particle identifiers from one colour channel to p_indices. Is this roughly how it's supposed to go?

stevenvanuytsel commented 3 years ago

Hi, I've tried using the pair_correlation_2d function and I have a question about the histogram values it returns. I've shown an example calculation below. I am looping over all frames in a stack to find the RDF describing how the particles of type 'SM' are distributed around the particle of type 'pore'. Thus, for each frame I run tp.static.pair_correlation_2d to compute the RDF between the pore and the SMs. The returned bin values are 0 for most of the bins (as I'd expect given the small number of particles; I basically just need to show that there's no correlation between pore and SM location), but some bins return very large numbers. I realise that this is a consequence of my particle density being practically zero, but I don't understand how to interpret these values or what to do with them. A second question: after looping over all frames, can I just add all the corresponding histogram values together to get the global RDF over my entire stack?

Thanks for your help!

frame_df

           y          x     mass      size       ecc  ...  raw_mass  ep  frame  particle  type
0  50.053230  77.985694  26282.0  4.738248  0.012912  ...   26282.0 NaN    900         3  pore
1  52.008451  31.895462   6390.0  3.523496  0.028124  ...    6390.0 NaN    900       276    SM
2  22.127063  53.976568   6060.0  3.496863  0.007996  ...    6060.0 NaN    900       277    SM

r_edges, g_r = tp.static.pair_correlation_2d(frame_df, 50, dr=.5, p_indices=[0])

g_r = [ 0. 0. 0. 0. 0. 0.

1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
40.9244348 0. 0. 0. 0.
1. 1. 1. 1. 0.
1. 1. 1. 1. 0.
1. 49.11748366 0. 0. 0.
1. 1. 1. ]

MaartenBransen commented 3 years ago

The radial distribution function gives a histogram of pairwise distances, normalized to the average number of pairwise distances in a randomly distributed (uncorrelated) system at the same density. On average, because your density is so low, you expect to find very few particle pairs (<<1) in each bin. So for every bin where you do happen to have a particle pair, it gives a high value because finding 1 particle pair is still much higher than what would be expected on average at that density. In other words, you're seeing the effects of not sampling enough particles to get a good distribution.
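To put rough numbers on this, here is a minimal sketch of the expected pair count in a single bin for an uncorrelated 2D system. The 100 x 100 pixel field of view and the bin position are hypothetical values chosen for illustration, and the sketch ignores the edge handling that trackpy applies, so it will not reproduce the exact values in the output above.

```python
import numpy as np

def expected_pairs_in_bin(ndensity, r_inner, dr):
    """Expected number of neighbours around one reference particle in the
    annulus [r_inner, r_inner + dr), for an uncorrelated 2D system with
    number density `ndensity` (no edge correction)."""
    annulus_area = np.pi * ((r_inner + dr) ** 2 - r_inner ** 2)
    return ndensity * annulus_area

# Hypothetical numbers: 3 particles in a 100 x 100 pixel field of view.
ndensity = 3 / (100 * 100)

# Expected pair count in a bin of width 0.5 starting at r = 30.
expected = expected_pairs_in_bin(ndensity, r_inner=30.0, dr=0.5)

# If exactly one pair happens to land in that bin, the normalized value is
# the observed count divided by the expected count.
g_single_pair = 1 / expected

print(f"expected pairs in bin: {expected:.4f}")
print(f"g(r) for a single observed pair: {g_single_pair:.1f}")
```

With an expected count far below 1, any bin that contains a single pair ends up with a value of several tens, while every other bin is exactly 0.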

Because the radial distribution function is normalized (such that a value of 1 means no correlation), you don't want to just sum them together. Provided the density remains roughly constant, you can take the average over many timesteps and get a better result.
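As an illustration of averaging rather than summing, the sketch below uses synthetic per-frame histograms instead of real trackpy output. It assumes every frame was computed with the same bin edges and roughly the same density; the names `g_frames` and `g_mean` and the Poisson model for the pair counts are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)

n_frames = 2000   # hypothetical number of frames in the stack
n_bins = 10       # hypothetical number of radial bins
expected = 0.05   # assumed expected pair count per bin (<< 1, low density)

# Synthetic stand-in for the per-frame g(r) arrays: uncorrelated particles
# give Poisson-distributed pair counts, normalized by the expected count.
counts = rng.poisson(expected, size=(n_frames, n_bins))
g_frames = counts / expected

# Any single frame is mostly zeros with the occasional large spike.
print("single frame:", g_frames[0])

# Averaging over frames (not summing) keeps the normalization intact, so an
# uncorrelated system converges towards g(r) = 1 in every bin.
g_mean = g_frames.mean(axis=0)
print("mean over frames:", np.round(g_mean, 2))
```

In a real analysis, the rows of `g_frames` would be the `g_r` arrays returned for each frame, collected in a list and stacked with `np.vstack`.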

soft-matter / trackpy

Questions about using the pair_correlation_2d function #638