Open sarahmanderni opened 10 years ago
Yes, NMF actually gives you a biclustering model: it groups the patients (columns) together with the genes (rows) whose expression patterns are characteristic of each patient group. The cluster memberships are returned by
predict(res)
, based on the basis component that contributes most in each patient. The contribution of each basis component is given by the matrix H (X = W * H), which is returned by
coef(res)
You can see the contribution patterns with
# by default, the contributions are scaled so that they sum to one for each patient
coefmap(res)
# consensus matrix
consensusmap(res)
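To make the membership rule concrete: assigning each patient to its most contributing basis component amounts to taking the row index of the largest entry in each column of H. Here is a minimal base-R sketch of that idea. The toy matrix and the patient names are made up for illustration, and the exact internals of predict() may differ in detail.

```r
# Toy coefficient matrix H (basis components x patients); values are made up.
# matrix() fills column by column, so each triple below is one patient.
H <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.8, 0.1,
    0.2, 0.1, 0.7,
    0.6, 0.3, 0.1),
  nrow = 3,
  dimnames = list(paste0("basis", 1:3), paste0("patient", 1:4))
)

# Scale each column so that the contributions sum to one for each patient.
H_scaled <- sweep(H, 2, colSums(H), "/")

# Assign each patient to the basis component with the largest contribution.
membership <- apply(H_scaled, 2, which.max)

# List the patient names that fall into each cluster.
clusters <- split(names(membership), membership)
print(clusters)
```

With the fitted object, something like split(colnames(mat), predict(res)) should give the same kind of list, assuming mat carries the patient barcodes as column names.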
Thanks for the response. I tried to estimate the rank to find the best value. You can see the results for ranks 3:6 in the figure. Clustering the data into 3 clusters gives the highest cophenetic coefficient, but rank 3 also has the highest dispersion. Do you think I can support my idea of clustering the samples into 3 groups in this situation?
The higher the dispersion the better, as it measures how distinct the consensus clusters are. So these two measures are actually consistent.
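To illustrate why a higher dispersion is better, here is a small base-R sketch. It assumes the usual definition of the dispersion of an n x n consensus matrix C, namely the mean of 4 * (C_ij - 1/2)^2 over all entries, which I believe is what the package computes. The function name dispersion_sketch and the two toy matrices are made up for this example.

```r
# Dispersion of a consensus matrix C: 1 when every entry is 0 or 1
# (perfectly reproducible clusters), 0 when every entry is 0.5.
dispersion_sketch <- function(C) {
  stopifnot(nrow(C) == ncol(C))
  sum(4 * (C - 0.5)^2) / nrow(C)^2
}

# Two clean clusters of two samples each: all entries are 0 or 1.
crisp <- matrix(
  c(1, 1, 0, 0,
    1, 1, 0, 0,
    0, 0, 1, 1,
    0, 0, 1, 1),
  nrow = 4
)

# Samples co-cluster in half of the runs: off-diagonal entries are 0.5.
fuzzy <- matrix(0.5, nrow = 4, ncol = 4)
diag(fuzzy) <- 1

print(dispersion_sketch(crisp))  # 1
print(dispersion_sketch(fuzzy))  # 0.25
```

On a real fit with multiple runs, dispersion(res) and cophcor(res) should return the two measures directly.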
Hi,
I have a matrix (mat) of gene expression data with patients (417 patients) as columns and genes (180 genes) as rows. I want to classify the patients (not the gene expression patterns) into four classes based on their gene expression. I used the following command: res <- nmf(mat, 4, nrun = 200, seed = 123456)
Do you think this is a correct way of classifying the patients? Using the aheatmap command, I can see that there are four separate bases, but I do not know how to get the barcodes of the patients for each basis. I tried the "basisnames" command, basisnames(res), but I got NULL.
How can I know which patients are grouped together? Thanks for the help.