thistleknot closed this issue 1 year ago
Thanks for your question.
If you take a look at the source code (https://github.com/DingWB/PyComplexHeatmap/blob/994839b40fe0507ab0bdedb77d96d9a1c2804b66/PyComplexHeatmap/clustermap.py#L2119), you will find that you can read the attributes row_order and col_order to get the clustering. Let's take the following example:
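import os, sys
import pickle
import numpy as np   # used below as np
import pandas as pd  # used below as pd
import PyComplexHeatmap
from PyComplexHeatmap import *
# Jupyter magic; remove this line when running as a plain script
%matplotlib inline
import matplotlib.pyplot as plt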
plt.rcParams['figure.dpi'] = 100
plt.rcParams['savefig.dpi']=300
#Generate an example dataset (random)
df = pd.DataFrame(['AAAA1'] * 5 + ['BBBBB2'] * 5, columns=['AB'])
df['CD'] = ['C'] * 3 + ['D'] * 3 + ['G'] * 4
df['EF'] = ['E'] * 6 + ['F'] * 2 + ['H'] * 2
df['F'] = np.random.normal(0, 1, 10)
df.index = ['sample' + str(i) for i in range(1, df.shape[0] + 1)]
df_box = pd.DataFrame(np.random.randn(10, 4), columns=['Gene' + str(i) for i in range(1, 5)])
df_box.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar = pd.DataFrame(np.random.uniform(0, 10, (10, 2)), columns=['TMB1', 'TMB2'])
df_bar.index = ['sample' + str(i) for i in range(1, df_bar.shape[0] + 1)]
df_scatter = pd.DataFrame(np.random.uniform(0, 10, 10), columns=['Scatter'])
df_scatter.index = ['sample' + str(i) for i in range(1, df_scatter.shape[0] + 1)]
df_heatmap = pd.DataFrame(np.random.randn(50, 10), columns=['sample' + str(i) for i in range(1, 11)])
df_heatmap.index = ["Fea" + str(i) for i in range(1, df_heatmap.shape[0] + 1)]
df_heatmap.iloc[1, 2] = np.nan
plt.figure(figsize=(6, 12))
# Despite its name, row_ha annotates the columns (axis=1) and is passed as top_annotation below
row_ha = HeatmapAnnotation(label=anno_label(df.AB, merge=True, rotation=15),
                           AB=anno_simple(df.AB, add_text=True), axis=1,
                           CD=anno_simple(df.CD, add_text=True),
                           Exp=anno_boxplot(df_box, cmap='turbo'),
                           Scatter=anno_scatterplot(df_scatter), TMB_bar=anno_barplot(df_bar),
                           )
cm = ClusterMapPlotter(data=df_heatmap, top_annotation=row_ha, col_split=2, row_split=3, col_split_gap=0.5,
                       row_split_gap=1, label='values', row_dendrogram=True, show_rownames=True, show_colnames=True,
                       tree_kws={'row_cmap': 'Dark2'})
plt.show()
If you want to export the derived row clusters, read the attribute cm.row_order (it is an attribute, not a method, so there are no parentheses):
print(cm.row_order)
# if row_split = None, you can also use the following method to get the row orders:
#print(cm.dendrogram_row.dendrogram['ivl'])
[['Fea10', 'Fea35', 'Fea21', 'Fea45', 'Fea11', 'Fea5', 'Fea23', 'Fea2', 'Fea19', 'Fea29', 'Fea3', 'Fea12', 'Fea4', 'Fea14', 'Fea32', 'Fea16', 'Fea40', 'Fea30', 'Fea28'], ['Fea22', 'Fea49', 'Fea20', 'Fea48', 'Fea26', 'Fea27', 'Fea44', 'Fea38', 'Fea37', 'Fea50', 'Fea24', 'Fea17'], ['Fea47', 'Fea1', 'Fea46', 'Fea42', 'Fea33', 'Fea43', 'Fea31', 'Fea9', 'Fea36', 'Fea15', 'Fea34', 'Fea8', 'Fea6', 'Fea7', 'Fea13', 'Fea25', 'Fea41', 'Fea18', 'Fea39']]
cm.row_order is a list with one element per row cluster (three here, because row_split=3), and each element is itself a list of row labels.
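Since the nested list is not always the most convenient shape, here is a minimal sketch of turning it into a label-to-cluster lookup and a single flat row order. The short row_order list below is a hypothetical stand-in for cm.row_order, and the 1-based cluster numbering is my own convention, not something the library assigns.

```python
# Hypothetical stand-in for cm.row_order: one inner list of row labels per cluster.
row_order = [
    ["Fea10", "Fea35", "Fea21"],
    ["Fea22", "Fea49"],
    ["Fea47", "Fea1", "Fea46"],
]

# Map every row label to a cluster number.
# Numbering starts at 1 and follows the order of the inner lists (an assumption
# made for this sketch; the library itself does not name the clusters this way).
cluster_of = {
    label: cluster_id
    for cluster_id, members in enumerate(row_order, start=1)
    for label in members
}

# Flatten the nested list into one list of row labels, cluster after cluster.
flat_rows = [label for members in row_order for label in members]

# Size of each cluster, in the same order as row_order.
cluster_sizes = [len(members) for members in row_order]

print(cluster_of)
print(flat_rows)
print(cluster_sizes)
```

With the real object you would replace the stand-in with row_order = cm.row_order. From there, pd.Series(cluster_of, name='cluster').to_csv('row_clusters.csv') should write the assignments to a file, and df_heatmap.loc[flat_rows] should give the data with rows in that flattened order. The same steps would apply to cm.col_order for the columns.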
I hope this answer is helpful to you. Thanks again for your question.
I see examples to show a cluster graph, but what if we need the derived clusters? What object should be called?