Closed germanbarcenas closed 3 years ago
Hi,
in principle there is nothing wrong with your approach. Some hints: As per error message, ndarray
(what you return in your computation) does not have an append
. Instead, you will want to use a sliced assign like kappas[:] = kappa
. Also your idea of the shape being (-1, 1)
looks correct to me.
In general I would say this is almost a functioning feature. :slightly_smiling_face:
Best Moritz
Thank you @clonker. That slice notation was helpful. I am very close, but now I'm dealing with a problem I wonder you might provide 1 more hint to solve (:
So here are my changes:
#%%
t=mdtraj.load(XTC,top=PDB)
r1t=t.xyz[:,r1,:]
s1t=t.xyz[:,s1,:]
r2t=t.xyz[:,r2,:]
s2t=t.xyz[:,s2,:]
kappas=np.zeros((t.n_frames,1))
Long1 = np.subtract(r1t,s1t)
UnitLong1 = Long1/np.linalg.norm(Long1)
UnitLong1Squeeze=np.squeeze(UnitLong1)
Long2 = np.subtract(r2t,s2t)
UnitLong2 = Long2/np.linalg.norm(Long2)
UnitLong2Squeeze=np.squeeze(UnitLong2)
CoM1 = np.add(r1t,s1t)/2
CoM2 = np.add(r2t,s2t)/2
MagCenterToCenter = np.linalg.norm(np.subtract(CoM2,CoM1))
UnitCenterToCenter = np.subtract(CoM2,CoM1)/MagCenterToCenter
UnitCenterToCenterSqueeze=np.squeeze(UnitCenterToCenter)
u1dotu2=np.dot(UnitLong1Squeeze,UnitLong2Squeeze.T)
u1dotd=np.dot(UnitLong1Squeeze,UnitCenterToCenterSqueeze.T)
u2dotd=np.dot(UnitLong1Squeeze,UnitCenterToCenterSqueeze.T)
k=u1dotu2-3*(u1dotd*u2dotd)
kappas[:]=k
and from here I get the following error. In fact its the 1st time I've ever seen this one:
Traceback (most recent call last):
File "R:\GermanBarcenas\dna\gromacs\codes\markovAnalysis.py", line 950, in <module>
u2dotd=np.dot(UnitLong1Squeeze,UnitCenterToCenterSqueeze.T)
File "<__array_function__ internals>", line 5, in dot
MemoryError: Unable to allocate 21.6 GiB for an array with shape (76134, 76134) and data type float32
Which tells me that these dot products are not eating up all of my memory. You might notice my transpose I added to those vectors. I did this because I was getting this before:
Traceback (most recent call last):
File "R:\GermanBarcenas\dna\gromacs\codes\markovAnalysis.py", line 951, in <module>
k = np.dot(UnitLong1Squeeze,UnitLong2Squeeze) - 3*(np.dot(UnitLong1Squeeze,UnitCenterToCenterSqueeze)*(np.dot(UnitCenterToCenterSqueeze,UnitLong2Squeeze)))
File "<__array_function__ internals>", line 5, in dot
ValueError: shapes (76134,3) and (76134,3) not aligned: 3 (dim 1) != 76134 (dim 0)
So now my problem is dealing with the large arrays that are created when processing the whole trajectory. Are there any more efficient strategies in either pyemma
or mdtraj
to work through this problem more efficiently? There must be because dihedral angles also have to be doing some vector manipulation, likely with dot products, and those are features I have used that don't crash memory.
I realize that I wasn't using arrays at all correctly. I have fixed it and now this returns the correct feature. Normally I would just edit out the mistake but I'll leave it up for people to learn from my mistakes (:
def kappa(traj: mdtraj.Trajectory):
r1t=t.xyz[:,r1,:]
s1t=t.xyz[:,s1,:]
r2t=t.xyz[:,r2,:]
s2t=t.xyz[:,s2,:]
kappas=np.zeros((t.n_frames,1))
Long1=[]
Long2=[]
CoM1=[]
CoM2=[]
UnitLong1=[]
UnitLong2=[]
MagCenterToCenter=[]
UnitCenterToCenter=[]
k=[]
for idx,val in enumerate(kappas):
Long1.append(np.subtract(r1t[idx],s1t[idx]))
Long2.append(np.subtract(r2t[idx],s2t[idx]))
UnitLong1.append(Long1[idx]/np.linalg.norm(Long1[idx]))
UnitLong2.append(Long2[idx]/np.linalg.norm(Long2[idx]))
CoM1.append(np.add(r1t[idx],s1t[idx])/2)
CoM2.append( np.add(r2t[idx],s2t[idx])/2)
MagCenterToCenter.append(np.linalg.norm(np.subtract(CoM2[idx],CoM1[idx])))
UnitCenterToCenter.append(np.subtract(CoM2[idx],CoM1[idx])/MagCenterToCenter[idx])
k.append(np.dot(UnitLong1[idx],UnitLong2[idx].T) - 3*(np.dot(UnitLong1[idx],UnitCenterToCenter[idx].T)*(np.dot(UnitCenterToCenter[idx],UnitLong2[idx].T))))
orientationFactor=np.asarray(k).reshape(-1,1)
return orientationFactor
thanks @germanbarcenas! i'll go ahead and close this issue then.
Hello,
Sorry if this is a repeat of similar questions. I notice when people make custom functions, it tends to always end up using some mdtraj analysis with some tweaks, like maybe the distance of center of mass that end up combining existing analysis into something new. I want to make something that is completely new to use in some pyemma code. I can call this value an orientation factor that tells us how molecules will align with respect to another. It involves me selecting some atoms to define vectors that will tell me about the molecules alignment. Here is an example of how that would look like:
Here I've selected the atoms named C9 and C20 of my residues of interest to create the vectors. This works fine. Now this is how the analysis would look like:
This works well in other instances where I might just append these values to an array that looks through the trajectory for each frame. I want to return this as a feature like this:
feat.add_custom_func(kappa,dim=1,description='Orientation Factor')
My question is, how do I return kappa as the ndarray that is needed for pyemma to accept this as a feature? I notice that custom features always have some moment where it invokes the trajectory to make an array of values for each frame. I think me I should be doing this to grab that r1,r2,s1,s2 values. My first thought is to use the
traj.xyz
function from mdtraj, and select those usingtraj.xyz[:,r1,:]
functions, but I'm not really sure how I would use them after. I'm I on the right track? I would image the shape of the kappa feature should be(-1,1)
right, because I should just be getting 1 kappa for each frame. So my first pass at this issue of returning an ndarray looks like this:and here is the error I get:
Have I even started this approach correctly?
Thanks,
pyemma v
2.5.9