Your method sounds very promising. I have a data set of 750 input coordinates with each approx. 2 million samples. For using amino_fast.py I am running out of 64GB RAM. Using the kde with amino.py would take almost a year on my machine. Hence, I thought I could simply use an histogram instead of kde. So I replaced
# Binning two OP's in 2D space
def d2_bin(self, x, y):
""" Calculate a joint probability distribution for two trajectories.
Parameters
----------
x : np.array
Trajectory of first OP.
y : np.array
Trajcetory of second OP.
Returns
-------
p : np.array
self.bins by self.bins array of joint probabilities from KDE.
"""
KD = KernelDensity(bandwidth=self.bandwidth,kernel=self.kernel)
KD.fit(np.column_stack((x,y)), sample_weight=self.weights)
grid1 = np.linspace(np.min(x),np.max(x),self.bins)
grid2 = np.linspace(np.min(y),np.max(y),self.bins)
mesh = np.meshgrid(grid1,grid2)
data = np.column_stack((mesh[0].reshape(-1,1),mesh[1].reshape(-1,1)))
samp = KD.score_samples(data)
samp = samp.reshape(self.bins,self.bins)
p = np.exp(samp)/np.sum(np.exp(samp))
return p
simply with
# Binning two OP's in 2D space
def d2_bin(self, x, y):
""" Calculate a joint probability distribution for two trajectories.
Parameters
----------
x : np.array
Trajectory of first OP.
y : np.array
Trajcetory of second OP.
Returns
-------
p : np.array
self.bins by self.bins array of joint probabilities from KDE.
"""
hist, _, _ = np.histogram2d(x, y, bins=self.bins)
hist = hist.T
return hist / np.sum(hist)
Is this reasonable for bins=200, or is kde needed for a sensitive distance measure? Thank you for your advice.
Hello,
Your method sounds very promising. I have a data set of 750 input coordinates with each approx. 2 million samples. For using amino_fast.py I am running out of 64GB RAM. Using the kde with amino.py would take almost a year on my machine. Hence, I thought I could simply use an histogram instead of kde. So I replaced
simply with
Is this reasonable for
bins=200
, or is kde needed for a sensitive distance measure? Thank you for your advice.