Closed wyu54 closed 1 year ago
Try this data: testdata.csv. NearestVelocityPredict confuses particles in the middle, but this is fixed after I made the aforementioned change.
Also, link_df does not work for me, but link_df_iter does. It is probably missing a kw['predictor'] = self.predict in def wrap_single(self, linking_fcn, *args, **kw):
I can confirm that there is something wrong with the link_df() method of the NearestVelocityPredict class. The results of this method differ from those obtained with link_df_iter (and are probably wrong) after changing
self.interpolator = NearestNDInterpolator(positions.values, vels.values)
to
self.interpolator = NearestNDInterpolator(positions.values[:,::-1], vels.values[:,::-1])
I provide my minimal dataset and code below.
This is also visible in different versions of the NearestVelocityPredict section of the Dynamic predictors tutorial:
The version with link_df() is working here: https://soft-matter.github.io/trackpy/v0.3.2/tutorial/prediction.html
Here it is not working; there are no tracks to the particles of the third frame: https://soft-matter.github.io/trackpy/v0.5.0/tutorial/prediction.html
import pandas as pd
import trackpy as tp
import matplotlib.pyplot as plt

features = pd.read_csv('dat.csv')
pred = tp.predict.NearestVelocityPredict(span=1)

# Link the whole DataFrame in one call.
traj_1 = pred.link_df(
    features,
    search_range=4,
    pos_columns=["projection_x_coordinate", "projection_y_coordinate"])

# Link the same features frame by frame.
df_list = [frame for i, frame in features.groupby('frame')]
traj_2 = pred.link_df_iter(
    df_list,
    search_range=4,
    pos_columns=["projection_x_coordinate", "projection_y_coordinate"])
traj_2 = pd.concat(traj_2)

# Plot both results side by side.
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(10, 5), sharey=True)
for i, track_i in traj_1.groupby('particle'):
    ax1.plot(track_i.sort_values('frame')['hdim_2'],
             track_i.sort_values('frame')['hdim_1'],
             label='particle {}'.format(int(i)),
             marker='^',
             linestyle='-')
ax1.legend()
ax1.set_title('link_df')
for i, track_i in traj_2.groupby('particle'):
    ax2.plot(track_i.sort_values('frame')['hdim_2'],
             track_i.sort_values('frame')['hdim_1'],
             label='particle {}'.format(int(i)),
             marker='^',
             linestyle='-')
ax2.legend()
ax2.set_title('link_df_iter')
fig.savefig('linking.jpeg')
Yeah, I can confirm that predict isn't even called with link_df. Seems like there was a bug introduced by the refactoring that happened after 0.3.2.
Thanks for these reports! It looks like there's a serious omission in predict.NullPredict.wrap_single(). As you identified, the workaround is to use the link_df_iter() method, which calls the (correct) code in predict.NullPredict.wrap().
I probably can't get to it this week, but the needed steps are
link_df()
!wrap_single()
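The omission discussed above can be sketched with a toy example. This is only an illustration of the pattern, not trackpy's actual code: ToyPredictor, toy_linker, wrap_single_buggy and wrap_single_fixed are hypothetical names, and the only line taken from the thread is kw['predictor'] = self.predict.

```python
class ToyPredictor:
    """Hypothetical stand-in for a predictor class; not trackpy's implementation."""

    def predict(self, t, particles):
        # Placeholder prediction: return the particles unchanged.
        return particles

    def wrap(self, linking_fcn, *args, **kw):
        # Iterator path: the predictor is injected before linking starts.
        kw['predictor'] = self.predict
        yield from linking_fcn(*args, **kw)

    def wrap_single_buggy(self, linking_fcn, *args, **kw):
        # Single-call path without the injection: linking runs with no predictor.
        return list(linking_fcn(*args, **kw))

    def wrap_single_fixed(self, linking_fcn, *args, **kw):
        # Same path with the line suggested in the thread added.
        kw['predictor'] = self.predict
        return list(linking_fcn(*args, **kw))


def toy_linker(frames, predictor=None):
    """Hypothetical linker that reports, per frame, whether it got a predictor."""
    for frame in frames:
        yield (frame, predictor is not None)


pred = ToyPredictor()
frames = [0, 1, 2]
via_wrap = list(pred.wrap(toy_linker, frames))
via_buggy = pred.wrap_single_buggy(toy_linker, frames)
via_fixed = pred.wrap_single_fixed(toy_linker, frames)

print(via_wrap)
print(via_buggy)
print(via_fixed)
```

In this sketch the buggy path still returns a result for every frame, which is why the missing predictor would only show up as different (unpredicted) tracks rather than as an error.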
Going back to the original report of transposed coordinates by @wyu54: It would be very helpful to have a minimal example to reproduce that issue. I suspect that the problem is that predict.NullPredict.wrap_single() defaults to ['x', 'y'] for pos_columns, instead of ['y', 'x'] as would be provided by guess_pos_columns(). But I am reluctant to make that edit if we have no way to test whether it fixes the problem.
For now, the workaround would be to specify pos_columns=['y', 'x'] explicitly when calling link_df_iter().
Since I ran into the same issue as @wyu54, I designed a small example. Explicitly calling with pos_columns and switching x and y doesn't seem to work, but not specifying pos_columns at all solves it:
import pandas as pd
import trackpy as tp
import matplotlib.pyplot as plt

# Two particles over four frames.
d = {'frame': [1, 2, 3, 4, 1, 2, 3, 4],
     'x': [0, 1, 2, 3, 1, 1.5, 2, 2.5],
     'y': [0, 1, 2, 3, 2.8, 2.3, 1.8, 1.3]}
df = pd.DataFrame(data=d)
df_list = [frame for i, frame in df.groupby('frame')]
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=(15, 5))

# Case 1: pos_columns not specified.
pred = tp.predict.NearestVelocityPredict()
traj_1 = pred.link_df_iter(
    df_list,
    100)
traj_1 = pd.concat(traj_1)

# Case 2: pos_columns = ['x', 'y'].
pred = tp.predict.NearestVelocityPredict()
traj_2 = pred.link_df_iter(
    df_list,
    100,
    pos_columns=['x', 'y'])
traj_2 = pd.concat(traj_2)

# Case 3: pos_columns = ['y', 'x'].
pred = tp.predict.NearestVelocityPredict()
traj_3 = pred.link_df_iter(
    df_list,
    100,
    pos_columns=['y', 'x'])
traj_3 = pd.concat(traj_3)

ax1.set_title('no pos_columns')
tp.plot_traj(traj_1, plot_style={'marker': 'd'}, ax=ax1)
ax2.set_title("pos_columns = ['x', 'y']")
tp.plot_traj(traj_2, plot_style={'marker': 'd'}, ax=ax2)
ax3.set_title("pos_columns = ['y', 'x']")
tp.plot_traj(traj_3, plot_style={'marker': 'd'}, ax=ax3)
Just a note that while I wasn't able to reproduce @snilsn's plots, I was able to produce something similar by:
import trackpy
import pandas as pd
import matplotlib.pyplot as plt

# Two particles whose paths cross over four frames.
d = {'frame': [1, 2, 3, 4, 1, 2, 3, 4],
     'x': [0, 1, 2, 3, 1, 1.5, 2, 2.5],
     'y': [0, 1, 2, 3, 3, 2, 1, 0]}
df = pd.DataFrame(data=d)
df_list = [frame for i, frame in df.groupby('frame')]
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=(15, 5))

# Case 1: pos_columns not specified.
pred = trackpy.predict.NearestVelocityPredict()
traj_1 = pred.link_df_iter(
    df_list,
    100)
traj_1 = pd.concat(traj_1)

# Case 2: pos_columns = ['x', 'y'].
pred = trackpy.predict.NearestVelocityPredict()
traj_2 = pred.link_df_iter(
    df_list,
    100,
    pos_columns=['x', 'y'])
traj_2 = pd.concat(traj_2)

# Case 3: pos_columns = ['y', 'x'].
pred = trackpy.predict.NearestVelocityPredict()
traj_3 = pred.link_df_iter(
    df_list,
    100,
    pos_columns=['y', 'x'])
traj_3 = pd.concat(traj_3)

ax1.set_title('no pos_columns')
trackpy.plot_traj(traj_1, plot_style={'marker': 'd'}, ax=ax1)
ax2.set_title("pos_columns = ['x', 'y']")
trackpy.plot_traj(traj_2, plot_style={'marker': 'd'}, ax=ax2)
ax3.set_title("pos_columns = ['y', 'x']")
trackpy.plot_traj(traj_3, plot_style={'marker': 'd'}, ax=ax3)
Yeah, it looks like ultimately there are two issues here:
1. Specifying pos_columns breaks link_df_iter, regardless of what you specify.
2. link_df does not properly wrap link and give it a predictor.
Addressing 1, my rudimentary print statement debugging indicates that the velocities are being calculated correctly by _RecentVelocityPredict._compute_velocities regardless of what pos_columns is specified as. Further, it looks like vels and positions are identical between not specifying pos_columns and setting pos_columns=['x','y']. I still need to dig into this further.
Regardless, it looks like the workaround for now is to use link_df_iter and not specify pos_columns. That seems to produce correct results in our tests and I think that will be what we switch to in tobac.
Continuing my work on pos_columns breaking link_df_iter, it looks like linking.linking.link_df_iter expects pos_columns in the reverse order from the preceding functions. When pos_columns is None, guess_pos_columns returns ['y', 'x'], which is actually the opposite of the defaults in the calling functions. The band-aid solution would be to reverse pos_columns, but I'm not sure that is a correct solution, especially given that I haven't tested it for 3D/ND.
I'm going to try to continue to find a solution here for at least the first issue, as I doubt that the second issue can be fully solved without it.
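The ordering mismatch described here can be shown with a small sketch. It does not call trackpy; extract is a hypothetical helper, the two rows are taken from the example data posted earlier in the thread, and the two orders are the ones named in the comments above.

```python
def extract(rows, pos_columns):
    """Return one coordinate tuple per row, ordered as pos_columns."""
    return [tuple(row[col] for col in pos_columns) for row in rows]


# Two features from the example data posted earlier in the thread.
rows = [{'x': 1.0, 'y': 2.8}, {'x': 1.5, 'y': 2.3}]

caller_order = ['x', 'y']   # default in the calling functions, per the thread
linker_order = ['y', 'x']   # what guess_pos_columns returns, per the thread

as_caller = extract(rows, caller_order)
as_linker = extract(rows, linker_order)

# The band-aid mentioned above: reverse pos_columns before extracting.
band_aid = extract(rows, caller_order[::-1])

print(as_caller)   # [(1.0, 2.8), (1.5, 2.3)]
print(as_linker)   # [(2.8, 1.0), (2.3, 1.5)]
print(band_aid == as_linker)   # True
```

Whether a plain reversal is also right for 3D/ND data is exactly the open question raised above; this sketch only covers the 2D case.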
Thanks for the examples! Testing for these issues and then cleanly fixing them turned out to be quite involved. Please take a look at #710, and try it out if that's easy for you.
Solves the problems for all of my cases @nkeim, thanks for your efforts!
There might be a bug in NearestVelocityPredict:
line 172 self.interpolator = NearestNDInterpolator(positions.values, vels.values)
I think it should be
self.interpolator = NearestNDInterpolator(positions.values[:,::-1], vels.values[:,::-1])
Changing this fixed the particle confusion problem in my tracking.
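To illustrate how a column-order mismatch could make a nearest-neighbour velocity lookup pick the wrong particle, here is a minimal pure-Python sketch. It does not use scipy's NearestNDInterpolator or trackpy; nearest_velocity and the two-particle data are hypothetical and chosen only to make the effect obvious.

```python
import math


def nearest_velocity(table, query):
    """Return the velocity stored with the table position closest to query."""
    _, velocity = min(table, key=lambda entry: math.dist(entry[0], query))
    return velocity


# Hypothetical data: each entry is (position, velocity), both in (x, y) order.
table_xy = [((0.0, 5.0), (1.0, 0.0)),   # particle A moves along +x
            ((5.0, 0.0), (0.0, 1.0))]   # particle B moves along +y

query_xy = (0.0, 5.0)        # particle A, same column order as the table
query_yx = query_xy[::-1]    # particle A again, but with columns reversed

matched = nearest_velocity(table_xy, query_xy)
mismatched = nearest_velocity(table_xy, query_yx)

print(matched)      # (1.0, 0.0): particle A's own velocity
print(mismatched)   # (0.0, 1.0): particle B's velocity, picked up by mistake
```

In this toy case, making the table and the query agree on column order (by reversing one of them) restores the correct match, which is consistent with the effect reported for the [:,::-1] change.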