nearestvelocity - GitHub Issues

wyu54 commented 2 years ago

There might be a bug in NearestVelocityPredict:

line 172 self.interpolator = NearestNDInterpolator(positions.values, vels.values)

I think it should be

self.interpolator = NearestNDInterpolator(positions.values[:,::-1], vels.values[:,::-1])

Changing this fixed particle confusion problem in my tracking.
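To illustrate why column order can matter for a nearest-neighbour velocity lookup, here is a minimal sketch in plain Python. It is a toy stand-in for the interpolator, not trackpy or SciPy code; the function name, the positions, and the velocities are all made up for illustration. It assumes the stored points use one column order while the query arrives in the reverse order.

```python
import math

def nearest_velocity(known_positions, known_velocities, query):
    """Toy lookup: return the velocity of the stored point closest to `query`."""
    best = min(range(len(known_positions)),
               key=lambda i: math.dist(known_positions[i], query))
    return known_velocities[best]

# Two hypothetical particles, stored in (x, y) order.
positions_xy = [(0.0, 3.0), (3.0, 0.0)]
velocities_xy = [(1.0, 0.0), (-2.0, 0.5)]

query_xy = (0.5, 3.0)        # a point close to the first particle
query_yx = query_xy[::-1]    # the same point, but given in (y, x) order

# Stored data and query agree on column order: the first particle is found.
consistent = nearest_velocity(positions_xy, velocities_xy, query_xy)

# Query order is reversed relative to the stored data: the wrong particle is found.
mismatched = nearest_velocity(positions_xy, velocities_xy, query_yx)

# Reversing the columns of both stored arrays makes them match the query again.
positions_yx = [p[::-1] for p in positions_xy]
velocities_yx = [v[::-1] for v in velocities_xy]
fixed = nearest_velocity(positions_yx, velocities_yx, query_yx)

print(consistent, mismatched, fixed)
```

In this sketch the reversal only helps because the query order differs from the stored order; whether that is the actual cause inside trackpy is not established at this point in the thread.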

wyu54 commented 2 years ago

Try this data: testdata.csv. NearestVelocityPredict confuses particles in the middle, but the confusion disappears after I make the change above.

wyu54 commented 2 years ago

Also, link_df does not work for me, while link_df_iter does. It is probably missing a kw['predictor'] = self.predict line

in def wrap_single(self, linking_fcn, *args, **kw):
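The suspected omission can be sketched with a toy wrapper. This is not the trackpy source: ToyPredictor, toy_link, and the two wrap_single variants are hypothetical names, and the only detail taken from the report is the kw['predictor'] = self.predict line.

```python
def toy_link(frames, predictor=None):
    """Stand-in linking function: reports whether it received a predictor."""
    return "predicted" if predictor is not None else "unpredicted"

class ToyPredictor:
    def predict(self, t1, particles):
        # A real predictor would return expected positions at time t1.
        return particles

    def wrap_single_buggy(self, linking_fcn, *args, **kw):
        # The predictor is never handed to the linker, so linking runs unpredicted.
        return linking_fcn(*args, **kw)

    def wrap_single_fixed(self, linking_fcn, *args, **kw):
        # Inject the predictor before delegating, as suggested in the report.
        kw['predictor'] = self.predict
        return linking_fcn(*args, **kw)

pred = ToyPredictor()
buggy_result = pred.wrap_single_buggy(toy_link, [])
fixed_result = pred.wrap_single_fixed(toy_link, [])
print(buggy_result, fixed_result)
```

The buggy variant still returns a result without raising an error, which may be why a missing predictor is easy to overlook.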

snilsn commented 2 years ago

I can confirm that something is wrong with the link_df() method of the NearestVelocityPredict class. The results of this method are probably wrong: they differ from those obtained with link_df_iter after changing

self.interpolator = NearestNDInterpolator(positions.values, vels.values)

to

self.interpolator = NearestNDInterpolator(positions.values[:,::-1], vels.values[:,::-1])

I provide my minimal dataset and code below.

This is also visible in different versions of the NearestVelocityPredict section of the Dynamic predictors tutorial:

The version with link_df() works here: https://soft-matter.github.io/trackpy/v0.3.2/tutorial/prediction.html

Here it does not; no tracks reach the particles of the third frame: https://soft-matter.github.io/trackpy/v0.5.0/tutorial/prediction.html

import pandas as pd
import trackpy as tp
import matplotlib.pyplot as plt

features = pd.read_csv('dat.csv')

pred = tp.predict.NearestVelocityPredict(span=1)
traj_1 = pred.link_df(
            features,
            search_range=4,
            pos_columns=["projection_x_coordinate", "projection_y_coordinate"])

df_list = [frame for i, frame in features.groupby('frame')]

traj_2 = pred.link_df_iter(
            df_list,
            search_range=4,
            pos_columns=["projection_x_coordinate", "projection_y_coordinate"])
traj_2 = pd.concat(traj_2)

fig, (ax1, ax2) = plt.subplots(ncols = 2, figsize = (10, 5), sharey=True)

for i, track_i in traj_1.groupby('particle'):
    ax1.plot(track_i.sort_values('frame')['hdim_2'], 
             track_i.sort_values('frame')['hdim_1'],
            label='particle {}'.format(int(i)),
            marker ='^',
            linestyle='-') 
ax1.legend()
ax1.set_title('link_df')

for i, track_i in traj_2.groupby('particle'):
    ax2.plot(track_i.sort_values('frame')['hdim_2'], 
             track_i.sort_values('frame')['hdim_1'],
            label='particle {}'.format(int(i)),
            marker ='^',
            linestyle='-') 
ax2.legend()
ax2.set_title('link_df_iter')
fig.savefig('linking.jpeg')

[Attachments: dat.csv; figure "linking" with panels link_df and link_df_iter]

freemansw1 commented 2 years ago

Yeah, I can confirm that predict isn't even called with link_df. Seems like there was a bug introduced by the refactoring that happened after 0.3.2.

nkeim commented 2 years ago

Thanks for these reports! It looks like there's a serious omission in predict.NullPredict.wrap_single(). As you identified, the workaround is to use the link_df_iter() method, which calls the (correct) code in predict.NullPredict.wrap().

I probably can't get to it this week, but the needed steps are

[x] Add a test for link_df()!
[x] Fix wrap_single()
[ ] Release v0.5.1 and make sure documentation is rebuilt.

nkeim commented 2 years ago

Going back to the original report of transposed coordinates by @wyu54: It would be very helpful to have a minimal example to reproduce that issue. I suspect that the problem is that predict.NullPredict.wrap_single() defaults to ['x', 'y'] for pos_columns, instead of ['y', 'x'] as would be provided by guess_pos_columns(). But I am reluctant to make that edit if we have no way to test whether it fixes the problem.

For now, the workaround would be to specify pos_columns=['y', 'x'] explicitly when calling link_df_iter().

snilsn commented 2 years ago

Since I ran into the same issue as @wyu54, I designed a small example. Explicitly calling with pos_columns and switching x and y doesn't seem to work, but not specifying pos_columns at all solves it:

import pandas as pd
import trackpy as tp
import matplotlib.pyplot as plt

d = {'frame': [1, 2, 3, 4, 1, 2, 3, 4], 
     'x': [0, 1, 2, 3, 1, 1.5, 2, 2.5], 
     'y': [0, 1, 2, 3, 2.8, 2.3, 1.8, 1.3]}
df = pd.DataFrame(data=d)
df_list = [frame for i, frame in df.groupby('frame')]

fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize = (15, 5))

pred = tp.predict.NearestVelocityPredict()
traj_1 = pred.link_df_iter(
            df_list,
            100)
traj_1 = pd.concat(traj_1)

pred = tp.predict.NearestVelocityPredict()
traj_2 = pred.link_df_iter(
            df_list,
            100,
            pos_columns = ['x', 'y'])
traj_2 = pd.concat(traj_2)

pred = tp.predict.NearestVelocityPredict()
traj_3 = pred.link_df_iter(
            df_list,
            100,
            pos_columns = ['y', 'x'])
traj_3 = pd.concat(traj_3)

ax1.set_title('no pos_columns')
tp.plot_traj(traj_1, plot_style={'marker':'d'}, ax = ax1)
ax2.set_title("pos_columns = ['x', 'y']")
tp.plot_traj(traj_2, plot_style={'marker':'d'}, ax = ax2)
ax3.set_title("pos_columns = ['y', 'x']")
tp.plot_traj(traj_3, plot_style={'marker':'d'}, ax = ax3)

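[Figure "col_input": three panels titled no pos_columns, pos_columns = ['x', 'y'], and pos_columns = ['y', 'x']]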

freemansw1 commented 2 years ago

Just a note that while I wasn't able to reproduce @snilsn's plots, I was able to produce something similar by:

import trackpy
import pandas as pd
import matplotlib.pyplot as plt

d = {'frame': [1, 2, 3, 4, 1, 2, 3, 4], 
     'x': [0, 1, 2, 3, 1, 1.5, 2, 2.5], 
     'y': [0, 1, 2, 3, 3, 2, 1, 0]}
df = pd.DataFrame(data = d)

df_list = [frame for i, frame in df.groupby('frame')]

fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize = (15, 5))

pred = trackpy.predict.NearestVelocityPredict()
traj_1 = pred.link_df_iter(
            df_list,
            100)
traj_1 = pd.concat(traj_1)

pred = trackpy.predict.NearestVelocityPredict()
traj_2 = pred.link_df_iter(
            df_list,
            100,
            pos_columns = ['x', 'y'])
traj_2 = pd.concat(traj_2)

pred = trackpy.predict.NearestVelocityPredict()
traj_3 = pred.link_df_iter(
            df_list,
            100,
            pos_columns = ['y', 'x'])
traj_3 = pd.concat(traj_3)

ax1.set_title('no pos_columns')
trackpy.plot_traj(traj_1, plot_style={'marker':'d'}, ax = ax1)
ax2.set_title("pos_columns = ['x', 'y']")
trackpy.plot_traj(traj_2, plot_style={'marker':'d'}, ax = ax2)
ax3.set_title("pos_columns = ['y', 'x']")
trackpy.plot_traj(traj_3, plot_style={'marker':'d'}, ax = ax3)


freemansw1 commented 2 years ago

Yeah, it looks like ultimately there are two issues here:

1. Specifying pos_columns breaks link_df_iter, regardless of what you specify.
2. link_df does not properly wrap link and give it a predictor.

Addressing 1, my rudimentary print statement debugging indicates that the velocities are being calculated correctly by _RecentVelocityPredict._compute_velocities regardless of what pos_columns is specified as. Further, it looks like vels and positions are identical between not specifying pos_columns and setting pos_columns=['x','y']. I still need to dig into this further.
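As a rough check of what correct velocities look like for the test data above, here is a plain-Python sketch using simple finite differences. It assumes that the first four rows of the data belong to one particle and the last four to the other (the thread does not state this explicitly), and the helper names are hypothetical rather than trackpy functions.

```python
# (x, y) positions per frame, taken from the example data in the previous comment.
track_a = [(0, 0), (1, 1), (2, 2), (3, 3)]
track_b = [(1, 3), (1.5, 2), (2, 1), (2.5, 0)]

def velocity(prev, curr):
    """Finite-difference velocity between two consecutive frames."""
    return tuple(c - p for p, c in zip(prev, curr))

def predict(curr, vel):
    """Predicted next position: current position plus velocity."""
    return tuple(c + v for c, v in zip(curr, vel))

vel_a = velocity(track_a[0], track_a[1])
vel_b = velocity(track_b[0], track_b[1])

# With velocities in the same column order as the positions,
# the predictions land exactly on the frame-3 positions.
pred_a = predict(track_a[1], vel_a)
pred_b = predict(track_b[1], vel_b)

# If the velocity components were swapped, the prediction for the
# second particle would miss its real next position.
pred_b_swapped = predict(track_b[1], vel_b[::-1])

print(pred_a, pred_b, pred_b_swapped)
```

The first particle moves along the diagonal, so swapping its velocity components changes nothing; only the second particle would expose an ordering mistake in this data set.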

Regardless, it looks like the workaround for now is to use link_df_iter and not specify pos_columns. That seems to produce correct results in our tests and I think that will be what we switch to in tobac.

freemansw1 commented 2 years ago

Continuing my work on pos_columns breaking link_df_iter, it looks like linking.linking.link_df_iter expects pos_columns in the reverse order from the functions that call it. When pos_columns is None, guess_pos_columns returns ['y', 'x'], which is the opposite of the defaults in the calling functions. The band-aid fix would be to reverse pos_columns, but I'm not sure that is a correct solution, especially given that I haven't tested it for 3D/ND.

I'm going to try to continue to find a solution here for at least the first issue, as I doubt that the second issue can be fully solved without it.
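The order mismatch described above can be sketched without trackpy. The helper toy_guess_pos_columns is a hypothetical stand-in that returns ['y', 'x'] in 2D as stated in this comment; its 3D behaviour (['z', 'y', 'x']) and the sample feature values are assumptions made for illustration.

```python
def toy_guess_pos_columns(available):
    """Hypothetical stand-in: position columns in z, y, x order, keeping those present."""
    return [name for name in ("z", "y", "x") if name in available]

def coords(record, pos_columns):
    """Read a feature's coordinates in the given column order."""
    return tuple(record[name] for name in pos_columns)

feature = {"frame": 2, "x": 1.5, "y": 2.3}

wrapper_default = ["x", "y"]               # default assumed by the calling code
guessed = toy_guess_pos_columns(feature)   # what the linker falls back to

# The same feature yields different coordinate tuples under the two orders.
as_wrapper = coords(feature, wrapper_default)
as_linker = coords(feature, guessed)

# The band-aid: reverse the list before passing it on.
band_aid_2d = wrapper_default[::-1]
band_aid_3d = ["x", "y", "z"][::-1]

print(as_wrapper, as_linker, band_aid_2d, band_aid_3d)
```

A plain reversal lines up with this toy convention in both 2D and 3D, but that says nothing about whether the real linking code behaves the same way for 3D/ND data, which remains untested here.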

nkeim commented 2 years ago

Thanks for the examples! Testing for these issues and then cleanly fixing them turned out to be quite involved. Please take a look at #710, and try it out if that's easy for you.

snilsn commented 2 years ago

Solves the problems for all of my cases @nkeim, thanks for your efforts!

soft-matter / trackpy

nearestvelocity #699