Open JianiC opened 3 years ago
mds <- data %>% dist() %>% cmdscale(k = 2)
where the distance is the Euclidean distance between the rows (the shortest distance between two points).
Now I treat the similarity proportion as a squared correlation: corr = sqrt(data), dist = sqrt(2*(1-corr)),
where the percentage of shared variance is represented by the square of the correlation coefficient, r^2.
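A minimal sketch of the conversion described above, assuming `sim` is a symmetric matrix of similarity proportions in [0, 1] with ones on the diagonal. The name `sim` and the toy values are made up for illustration and are not the real data. Note that the precomputed distance goes straight into `cmdscale()` without another `dist()` call.

```r
# Toy symmetric similarity matrix (hypothetical values, 1 on the diagonal)
sim <- matrix(c(1.00, 0.81, 0.25,
                0.81, 1.00, 0.36,
                0.25, 0.36, 1.00),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Treat similarity as shared variance (r^2), so r = sqrt(similarity)
corr <- sqrt(sim)

# Convert correlation to a distance: d = sqrt(2 * (1 - r))
d <- as.dist(sqrt(2 * (1 - corr)))

# Classical MDS on the precomputed distances
mds <- cmdscale(d, k = 2)
print(mds)
```

With real data, `sim` would be the n x n similarity table, and the rows of `mds` would be the map coordinates of each sequence.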
I can at least observe the gradual evolution of T-cell immunity now, similar to the Smith paper.
Next: try with RSV, and also ask for comments on these calculations.
Simplify the distance calculation: dist = 1 - similarity
Still cannot separate all of the clusters, but it seems to be helpful.
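One possible way to judge whether the extra dimension helps, sketched with a made-up similarity matrix (`sim` and its values are hypothetical): run `cmdscale()` with `eig = TRUE` and compare the goodness-of-fit values for k = 2 and k = 3.

```r
# Hypothetical similarity matrix for five strains (symmetric, 1 on the diagonal)
sim <- matrix(c(1.00, 0.90, 0.30, 0.20, 0.50,
                0.90, 1.00, 0.35, 0.25, 0.55,
                0.30, 0.35, 1.00, 0.85, 0.60,
                0.20, 0.25, 0.85, 1.00, 0.50,
                0.50, 0.55, 0.60, 0.50, 1.00),
              nrow = 5, byrow = TRUE)

# Simplified distance: dist = 1 - similarity
d <- as.dist(1 - sim)

# eig = TRUE returns a list that includes the coordinates and goodness of fit
fit2 <- cmdscale(d, k = 2, eig = TRUE)
fit3 <- cmdscale(d, k = 3, eig = TRUE)

# GOF[1] is the share of the eigenvalue sum captured by the retained axes
print(fit2$GOF)
print(fit3$GOF)
```

If the GOF barely changes from k = 2 to k = 3, the third axis is probably not what separates the remaining clusters.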
Hmm - can you add a fourth dimension? Year of isolation?
On Nov 2, 2020, at 4:14 PM, JianiC notifications@github.com wrote:
3D map cluster
Still cannot separate all of the clusters, but it seems to be helpful. [Screen Shot 2020-11-02 at 4 11 23 PM] https://user-images.githubusercontent.com/47227610/97919820-72fbcf00-1d26-11eb-8786-3c6f1a1d783d.png
With the n x n comparison, the distance was calculated simply as 1 - similarity.
To correct what I did earlier: it is not MDS, it is just PCA.
Maybe for my RSV research, try using the clade-representative sequences in RSV?
Here, the cross-immunity to each sequence was taken as a feature, and the relative distances between strains were calculated as Euclidean distances (the minimal length between each pair of points). The locations of the vaccine strains were added.
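A quick check of the "not MDS, just PCA" remark, as a sketch on random data (the matrix `x` is hypothetical, standing in for strains by cross-immunity features): classical MDS applied to Euclidean distances between rows gives the same coordinates as the PCA scores, up to the sign of each axis.

```r
set.seed(1)
# Hypothetical feature matrix: 8 strains x 4 cross-immunity features
x <- matrix(rnorm(32), nrow = 8)

# Classical MDS on Euclidean distances between rows
mds <- cmdscale(dist(x), k = 2)

# Scores on the first two principal components
pca <- prcomp(x, center = TRUE)$x[, 1:2]

# Axes can be flipped, so compare absolute values
max_diff <- max(abs(abs(mds) - abs(pca)))
print(max_diff)
```

The equivalence holds only when the input to `cmdscale()` is a Euclidean distance computed from the feature rows. A distance supplied directly, such as 1 - similarity via `as.dist()`, is not covered by this check.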
K-means: to further evaluate the quality of the clustering.
Density-based clustering rationale: k-means is severely affected by the presence of noise and outliers in the data. But for MDS, k-means should be used for classification, because k-means is also based on Euclidean distance.
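A small sketch of k-means on map coordinates, assuming the coordinates come from an MDS run. Here `coords` is simulated with two well-separated groups, so the numbers are illustrative only. The ratio of between-cluster to total sum of squares is used as one rough quality measure. It is an assumption of this sketch, not necessarily the evaluation intended above.

```r
set.seed(42)
# Hypothetical 2-D map coordinates: two groups of 10 points each
coords <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
                matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2))

# k-means assigns points by squared Euclidean distance to cluster centers
km <- kmeans(coords, centers = 2, nstart = 10)

# Rough quality measure: between-cluster SS as a share of total SS
quality <- km$betweenss / km$totss
print(table(km$cluster))
print(quality)
```

Outliers would pull the cluster centers and inflate the within-cluster sum of squares, which is the weakness noted above.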
Ancestral Sequence reconstruction
Benefits for BEAST: https://groups.google.com/g/beast-users/c/P4_buh3u_5A
A pilot test with the Smith et al. data:
I did not really observe monophyletic clusters on the phylogenetic tree, but there seems to be some genetic diversity accumulating within the immune clusters?