Closed sbhakat closed 7 months ago
I assume you're talking about enspara/cluster/save_states.py
?
The unique_trajs
variable looks to me like it stores which trajectories have states that will get saved, not the names. I think that the "state number" (in the output path/to/output/State0-0.pdb
) is the cluster center index. This is @mizimmer90 's code... maybe he wants to weigh in?
What are you trying to do? There may be an easier way!
Justin
Thanks @justinrporter . So I wanted 2000 cluster which I indicated by n_clusters = 2000
and the centers_masses are saved as state000000-00.pdb
... state001999-00.pdb
(2000 of them zero indexed). I am using multiple trajectories which are named as traj_genheavy15000_.xtc .... traj_genheavy17999_1.xtc
. I want to know which state00**-00.pdb
comes from which traj_genheavy*.xtc
? It is enspara/cluster/save_states.py
and I believe an analogous version of the save_states.py
also exits with fast https://github.comcom/bowman-lab/fast/blob/master/msm_gen/save_states.py My script looks like
from fast.msm_gen import save_states as ss
dist_cutoff = 0.15
assignments = ra.load("./data/assignments.h5")
distances = ra.load("./data/distances.h5")
ss.save_states(assignments, distances, save_routine='masses',largest_center=dist_cutoff, n_confs=1, n_procs=64)
I think fast.msm_gen import save_states as ss
is analogous to enspara save_states.py
I see. Your cluster centers should be all the frames where distances == 0
(try ra.where(distances == 0)
to get the list of pairs (trajectory_id, frame_id)
). You can then use cluster.util.load_frames
to load them up into a single trajectory and then save that (I do as an .h5
file... this will be smaller on disk than pdbs and likely easier to work with anyway).
Not exactly what you asked, but maybe gets you where you want to go?
Is there a way to save the cluster centers with ''trj_filename' so that one can identifies which cluster center originates from which trajectory? I guess the following part of the script is doing the saving part
Is it as simple as to change
unique_trajs = np.unique(centers_location['traj_num'])
tounique_trajs = np.unique(centers_location['trj_filename'])
or is there other tricks?