Closed PeptideSimulator01 closed 5 months ago
Apologies for the delayed response. I'm sure you've moved on from this by now, but leaving this as a note for anyone else who may be curious.
One of the easiest ways to figure out where a cluster center comes from is to pass --center-indices indices.npy (or whatever you want the file to be named) as an input argument to the clustering app. The resulting file will be a 2D array with one row per cluster, each row holding the values (trajectory #, frame #). To deconstruct this, glob the trajectories as the clustering app does and map the trajectory numbers in your center-indices array back to the trajectories of interest.
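For anyone who wants a starting point, here is a rough sketch of that mapping. It assumes the center-indices file loads into one (trajectory #, frame #) pair per cluster and that your list of trajectory files is in the same order the clustering app used. The function name, file names, and glob pattern are made up for illustration, so adjust them to your setup.

```python
def map_centers_to_files(center_indices, traj_files):
    """Return (cluster_id, trajectory_file, frame) for each cluster center.

    center_indices: sequence of (trajectory index, frame index) pairs,
                    one pair per cluster
    traj_files:     trajectory paths, in the same order the clustering
                    app read them (assumption: you reproduce its glob)
    """
    mapping = []
    for cluster_id, (traj_idx, frame_idx) in enumerate(center_indices):
        mapping.append((cluster_id, traj_files[int(traj_idx)], int(frame_idx)))
    return mapping


# Hypothetical stand-ins. In practice these would come from something like:
#   center_indices = numpy.load("indices.npy")
#   traj_files = sorted(glob.glob("peptide*/traj_*.xtc"))  # assumed pattern/order
traj_files = ["pep1/traj_00.xtc", "pep1/traj_01.xtc", "pep2/traj_00.xtc"]
center_indices = [(0, 120), (2, 45), (1, 7)]

for cluster_id, path, frame in map_centers_to_files(center_indices, traj_files):
    print(f"cluster {cluster_id}: frame {frame} of {path}")
```

If the file order you pass in differs from the order the clustering app saw, the mapping will silently point at the wrong trajectories, so it is worth printing the globbed list and checking it first.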
The easiest way to figure out which frames went to which cluster centers is to parse the assignments file. In this case, assignments is a ragged array of shape (n_trajectories, n_frames), with each value being the index of the cluster center that frame was assigned to. As before, you can match the trajectories in the assignments file to the original trajectories by using the same glob pattern as the clustering app.
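To get the per-peptide cluster usage asked about in the question, something along these lines may help. It is only a sketch: it treats the assignments as one sequence of cluster indices per trajectory (already loaded into memory), and it assumes you have built a label per trajectory, in the same order, saying which peptide it belongs to. The function name, labels, and toy data are hypothetical.

```python
from collections import Counter


def count_by_group(assignments, traj_groups):
    """Count how often each cluster is visited, separately per group.

    assignments: one sequence of cluster indices per trajectory
                 (trajectories may have different lengths)
    traj_groups: one label per trajectory, e.g. which peptide it came from,
                 in the same order as assignments
    """
    counts = {}
    for traj_assignments, group in zip(assignments, traj_groups):
        counter = counts.setdefault(group, Counter())
        counter.update(int(c) for c in traj_assignments)
    return counts


# Toy data standing in for the real assignments file: three trajectories
# of different lengths, the first two from peptide 1, the last from peptide 2.
assignments = [[0, 0, 1], [1, 1], [2, 2, 2, 0]]
traj_groups = ["peptide1", "peptide1", "peptide2"]

counts = count_by_group(assignments, traj_groups)
for group, counter in counts.items():
    print(group, dict(sorted(counter.items())))
```

Comparing the two counters cluster by cluster then shows whether a cluster is shared between the peptides or populated by only one of them.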
Dear all,
I am trying to cluster 50 trajectories each from 2 different peptides (both 14 aa long). I only write the C, CA and N atoms to the trajectories so that I can cluster them together. Accordingly, I also cluster only on the RMSD of C, CA and N. After the clustering, I want to count how many times each cluster was used by peptide 1 or 2, and hopefully see a different distribution.
What I do:
If I then trace back which frame from which peptide was assigned to each cluster, I have the feeling that the first clusters are built from the first 50 trajectories and the last 5 clusters from the 50 trajectories of peptide 2. This is also reflected in the centroid structures: the first 5 are peptide 1, the last 5 peptide 2. Am I making a horrible mistake with this?
Any hint is appreciated, thanks for your effort.
The code I used: