tubiana / TTClust

clusterize molecular dynamic trajectories (amber, gromacs, charmm, namd, pdb...)

error while doing the clustering #20

Closed hima111997 closed 3 years ago

hima111997 commented 3 years ago

I installed TTClust using Anaconda and ran it with this command:

command used: ttclust -f "step5_production_tri_mut.dcd" -t "step3_charmm2namd.pdb"

but it gave me this error

output:


** TTCLUST 4.8.2 *****


======= TRAJECTORY READING =======
====== Clustering ========
creating distance matrix
NOTE : Extraction of subtrajectory for time optimisation
Interactive mode disabled. I will use the saved matrix

Distance Matrix File Loaded!
Matrix shape: (250, 250)
Scipy linkage in progress. Please wait. It can be long
ERROR : method name given for clustering didn't recognized : methods are : single; complete; average; weighted; centroid; ward. : check https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.cluster.hierarchy.linkage.html for more info

tubiana commented 3 years ago

Hi! Sorry for the delay. The error seems to come from the scipy linkage calculation. Unfortunately, because of a try/except statement I put there, the real error message is hidden. I have replaced it, so the scipy error should now be clearer. Could you please update to 4.8.3 and run it again? It will not fix the error, but I should be better able to help once I see the real error message :-)

Best, Thibault.

hima111997 commented 3 years ago

Thanks for your reply, and sorry for my late response. I updated it and ran it with the same command, and it gave me this error:


** TTCLUST 4.8.3 *****


======= TRAJECTORY READING =======
====== Clustering ========
creating distance matrix
NOTE : Extraction of subtrajectory for time optimisation
|>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>| Time: 0:00:08 |<<<<<<<<<<<<<<<<<<<<<<<<<<<<<|
Calculation ended - saving distance matrix
Saving distance matrix : backbone.npy
Matrix shape: (250, 250)
Scipy linkage in progress. Please wait. It can be long
Traceback (most recent call last):
  File "D:\games\ana\envs\ttclust_env\Scripts\ttclust-script.py", line 9, in <module>
    sys.exit(main())
  File "D:\games\ana\envs\ttclust_env\lib\site-packages\ttclust\ttclust.py", line 1311, in main
    traj = Cluster_analysis_call(args)
  File "D:\games\ana\envs\ttclust_env\lib\site-packages\ttclust\ttclust.py", line 1238, in Cluster_analysis_call
    distances, clusters_labels, linkage, cutoff = create_cluster_table(traj, args)
  File "D:\games\ana\envs\ttclust_env\lib\site-packages\ttclust\ttclust.py", line 706, in create_cluster_table
    linkage = sch.linkage(distances, method=args["method"])
  File "D:\games\ana\envs\ttclust_env\lib\site-packages\scipy\cluster\hierarchy.py", line 1057, in linkage
    raise ValueError("The condensed distance matrix must contain only "
ValueError: The condensed distance matrix must contain only finite values.

tubiana commented 3 years ago

Hi!

This is very strange... It's the first time I have seen this error. This function should take the pairwise RMSD distance matrix. Maybe there is a NaN in this matrix causing the error. In that case, it may be due to:

  1. The atom selection (the default selection is the backbone of the protein)
  2. The trajectory itself.

The distance matrix should not be big, and since it contains just the RMSD between each pair of frames, it should not be a privacy issue. If that is okay with you, could you attach it here or send it to me? It's the file called backbone.npy.

Best, Thibault.
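
A quick way to test the NaN hypothesis is to scan the saved matrix for non-finite entries. The following is only a minimal sketch, not part of TTClust: the helper name `non_finite_report` is hypothetical, and a small synthetic matrix stands in for the real backbone.npy, which would normally be loaded with `np.load`.

```python
import numpy as np


def non_finite_report(matrix):
    """Return (count, frames) for NaN/inf entries in a square distance matrix.

    Hypothetical helper for illustration; it is not part of TTClust.
    """
    matrix = np.asarray(matrix, dtype=float)
    bad = ~np.isfinite(matrix)          # True where the value is NaN or +/-inf
    rows, cols = np.nonzero(bad)
    # A bad entry (i, j) implicates both frame i and frame j.
    frames = sorted(set(rows.tolist()) | set(cols.tolist()))
    return int(bad.sum()), frames


# Synthetic stand-in for the saved matrix (assumption: 4 frames, one bad pair).
demo = np.zeros((4, 4))
demo[1, 3] = demo[3, 1] = np.nan

count, frames = non_finite_report(demo)
print(count, frames)
```

On a real run, the call would be `non_finite_report(np.load("backbone.npy"))`. If the reported frames cluster at one end of the trajectory, that points towards the trajectory or topology rather than the atom selection.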

hima111997 commented 3 years ago

backbone.zip

This is the backbone.npy file. However, since I don't remember where the files that produced this issue are, I tried to cluster another protein using the same command and uploaded its backbone.npy.

This is the error produced this time:


** TTCLUST 4.8.3 *****


======= TRAJECTORY READING =======
====== Clustering ========
creating distance matrix
NOTE : Extraction of subtrajectory for time optimisation
|>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>| Time: 0:00:01 |<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<|
Calculation ended - saving distance matrix
Saving distance matrix : backbone.npy
Matrix shape: (120, 120)
Scipy linkage in progress. Please wait. It can be long
Traceback (most recent call last):
  File "/usr/local/bin/ttclust", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/ttclust/ttclust.py", line 1311, in main
    traj = Cluster_analysis_call(args)
  File "/usr/local/lib/python3.6/dist-packages/ttclust/ttclust.py", line 1238, in Cluster_analysis_call
    distances, clusters_labels, linkage, cutoff = create_cluster_table(traj, args)
  File "/usr/local/lib/python3.6/dist-packages/ttclust/ttclust.py", line 706, in create_cluster_table
    linkage = sch.linkage(distances, method=args["method"])
  File "/usr/local/lib/python3.6/dist-packages/scipy/cluster/hierarchy.py", line 1061, in linkage
    raise ValueError("The condensed distance matrix must contain only "
ValueError: The condensed distance matrix must contain only finite values.

I have another question: which file should I use as input to the -t option, the PSF or the PDB file?

I am using CHARMM-GUI to produce the input files needed to run the dynamics with NAMD.

tubiana commented 3 years ago

I see what happened, but I don't know why...

The diagonal of the distance matrix should be 0, but I see very strange values like 2.34950230e+251 in the last 5 frames.

I have never seen that, and I'm not sure how it is possible... Did you look at your MD with VMD to check that nothing strange is happening to your protein backbone?

As for the format, both PSF and PDB should work... But if the PDB has an issue, maybe you should try with the PSF as well.
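
The diagonal check described above can be sketched as follows. This is an illustrative helper, not TTClust code: the name `suspicious_frames` and the thresholds `atol` and `max_rmsd` are assumptions chosen for the example. Note that a value such as 2.34950230e+251 is still a finite float64, so a plain `np.isfinite` test would not flag it; checking the diagonal and the magnitude catches it.

```python
import numpy as np


def suspicious_frames(matrix, atol=1e-6, max_rmsd=1e3):
    """Return indices of frames whose RMSD values look corrupted.

    Hypothetical helper; `atol` and `max_rmsd` are arbitrary illustrative
    thresholds, not values used by TTClust.
    """
    matrix = np.asarray(matrix, dtype=float)
    # The RMSD of a frame against itself must be (numerically) zero.
    diag_bad = ~np.isclose(np.diagonal(matrix), 0.0, atol=atol)
    # Flag rows holding implausibly large values; NaN also fails the
    # comparison, so it ends up flagged as well.
    huge = ~(np.abs(matrix) <= max_rmsd)
    row_bad = huge.any(axis=1)
    return np.nonzero(diag_bad | row_bad)[0].tolist()


# Synthetic stand-in (assumption): 6 frames, the last two with a bad diagonal.
demo = np.full((6, 6), 0.2)
np.fill_diagonal(demo, 0.0)
demo[4, 4] = demo[5, 5] = 2.34950230e+251

print(suspicious_frames(demo))
```

Applied to the real file, `suspicious_frames(np.load("backbone.npy"))` would list the frames worth inspecting in VMD.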

hima111997 commented 3 years ago

I tried with the PSF and it worked well.

tubiana commented 3 years ago

Happy to hear that then =D Sorry it took so long to just say "use the PSF" in the end... But I can now update my README with this 'known issue' :-)

Best regards, Thibault.