DingWB / PyComplexHeatmap

PyComplexHeatmap: A Python package to plot complex heatmap (clustermap)
https://dingwb.github.io/PyComplexHeatmap/
MIT License
249 stars 28 forks

Displaying images in place of text labels for heatmaps #64

Open dakomura opened 5 months ago

dakomura commented 5 months ago

I'm a pathology image analysis researcher and use your library for creating heatmaps.

I would appreciate it if you could add a feature that allows images, instead of text labels, to be displayed as the label for each feature (row) in the heatmap. This enhancement would greatly aid visual data interpretation in pathology image analysis.

Thank you for considering my suggestion.

DingWB commented 5 months ago

Could you please give me an example figure? @dakomura

dakomura commented 5 months ago

@DingWB Something like this: [Image: example]

Thank you.

DingWB commented 4 months ago

Would you be interested in implementing this feature yourself? I can give a brief introduction to how to do it. Basically, you can write a new annotation class that inherits from AnnotationBase: https://github.com/DingWB/PyComplexHeatmap/blob/f4b3edd246e283620be1afbe30cff515d6cee662/PyComplexHeatmap/annotations.py#L20 in the same way as an existing one such as anno_boxplot: https://github.com/DingWB/PyComplexHeatmap/blob/f4b3edd246e283620be1afbe30cff515d6cee662/PyComplexHeatmap/annotations.py#L706

dakomura commented 4 months ago

Yes, I'm interested in implementing it. I'm not sure I can, but I'll try.

DingWB commented 4 months ago

Thanks, @dakomura. It's not difficult at all: just write a new class that inherits from AnnotationBase and display the image in the new axes. I suggest using a series of image paths as input, instead of the RGB image data itself. I would be happy to help if you need my assistance.
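To illustrate the suggestion of passing paths rather than pixel data, here is a minimal sketch. It assumes a pandas Series of file paths indexed by sample name; load_images is a hypothetical helper, not part of PyComplexHeatmap, and the tiny PNG files are generated on the fly only so the example runs without external data.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from PIL import Image


def load_images(paths: pd.Series) -> list:
    """Hypothetical helper: open each path in order and return NumPy arrays."""
    return [np.array(Image.open(p)) for p in paths]


# Write three tiny single-colour PNGs so the example needs no external files.
tmpdir = Path(tempfile.mkdtemp())
colors = {"sample1": (255, 0, 0), "sample2": (0, 255, 0), "sample3": (0, 0, 255)}
for name, rgb in colors.items():
    tile = np.full((4, 6, 3), rgb, dtype=np.uint8)
    Image.fromarray(tile).save(tmpdir / f"{name}.png")

# The annotation input is just a Series of paths indexed by sample name.
img_paths = pd.Series(
    {name: str(tmpdir / f"{name}.png") for name in colors}, name="path"
)

# Reordering the samples only shuffles strings; pixels are read afterwards.
order = ["sample3", "sample1", "sample2"]
images = load_images(img_paths.loc[order])
```

With this layout the images are only opened once the final order is known, which is one plausible reason to prefer paths over preloaded arrays.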

dakomura commented 4 months ago

I wrote a new class anno_img to annotate images. Images from multiple samples were concatenated (img = np.concatenate([np.array(Image.open(imgfile)) for imgfile in imgfiles], axis=axis)).

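import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from PyComplexHeatmap.annotations import AnnotationBase

class anno_img(AnnotationBase):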
    """
        Annotate images.
    """
    def __init__(
        self,
        df=None,
        cmap="auto",
        colors=None,
        text_kws=None,
        height=None,
        legend=True,
        legend_kws=None,
        **plot_kws
    ):
        self.text_kws = text_kws if text_kws is not None else {}
        self.plot_kws = plot_kws
        super().__init__(
            df=df,
            cmap=cmap,
            colors=colors,
            height=height,
            legend=legend,
            legend_kws=legend_kws,
            **plot_kws
        )

    def _calculate_colors(self):  # add self.color_dict (each col is a dict)
        self.colors = None

    def _check_cmap(self, cmap):
        self.cmap = None

    def plot(
        self, ax=None, axis=1, subplot_spec=None, label_kws=None, ticklabels_kws=None
    ):  # add self.gs,self.fig,self.ax,self.axes
        if ax is None:
            ax = plt.gca()
        imgfiles = list(self.plot_data.iloc[:,0])
        img = np.concatenate([np.array(Image.open(imgfile)) for imgfile in imgfiles], axis=axis)
        print(img.shape)
        ax.imshow(img, aspect='auto')
        ax.set_axis_off()
        self.ax = ax
        self.fig = self.ax.figure
        return self.ax
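The concatenation step used in plot can be checked in isolation. The sketch below uses small constant arrays as stand-ins for the image files (an assumption for illustration) and shows how the axis argument changes the resulting shape. It also shows that np.concatenate raises a ValueError when the images do not match along the other axes, so input images of different sizes would need resizing first.

```python
import numpy as np

# Three stand-in "images", each 4 px high, 6 px wide, RGB.
tiles = [np.full((4, 6, 3), v, dtype=np.uint8) for v in (0, 128, 255)]

side_by_side = np.concatenate(tiles, axis=1)  # widths add up
stacked = np.concatenate(tiles, axis=0)       # heights add up

# A tile with a different height cannot be joined horizontally.
odd_tile = np.zeros((5, 6, 3), dtype=np.uint8)
try:
    np.concatenate(tiles + [odd_tile], axis=1)
    mismatch_ok = True
except ValueError:
    mismatch_ok = False
```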

However, when I tested it with the following code, the results were unexpected.

df_heatmap = pd.DataFrame(np.random.randn(30, 10), columns=['sample' + str(i) for i in range(1, 11)])
df_heatmap.index = ["Fea" + str(i) for i in range(1, df_heatmap.shape[0] + 1)]
df_heatmap.iloc[1, 2] = np.nan

plt.figure(figsize=(10, 16))

df_img_col = pd.DataFrame([f"{i:02}.jpg" for i in range(10)], columns=['Image'])
df_img_col.index = ['sample' + str(i) for i in range(1, 11)]
df_img_row = pd.DataFrame([f"{i:02}.jpg" for i in range(30)], columns=['Image'])
df_img_row.index = ["Fea" + str(i) for i in range(1, df_heatmap.shape[0] + 1)]

col_ha = HeatmapAnnotation(Image=anno_img(df_img_col, height=5), axis=1)
row_ha = HeatmapAnnotation(Image=anno_img(df_img_row, height=20), axis=0)

cm = ClusterMapPlotter(data=df_heatmap, top_annotation=col_ha, right_annotation=row_ha,
                       col_split=2, row_split=3, col_split_gap=0.5, row_split_gap=1,
                       label='values', row_dendrogram=True, show_rownames=False, show_colnames=False,
                       tree_kws={'row_cmap': 'Dark2'}, cmap='Spectral_r',
                       legend_gap=5, legend_hpad=2, legend_vpad=5)
plt.show()
[Image: heatmap]

Apparently, the images were only partially displayed. I attempted to adjust the display by removing aspect='auto' in ax.imshow, which resulted in different, but still unsatisfactory, visualization outcomes.

[Image: Screenshot 2024-02-19 8 52 47]

One of the input images looks like this: [Image: 1]

Could you assist me in resolving this issue?

Thank you.

DingWB commented 4 months ago

Hello, @dakomura. I solved this issue by adding an extent parameter. Please check out the development version on GitHub (pip install git+https://github.com/DingWB/PyComplexHeatmap.git).

https://github.com/DingWB/PyComplexHeatmap/blob/d05a08189b5a6d63fbe5ff394d966f8310faa427/PyComplexHeatmap/annotations.py#L1080
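For readers unfamiliar with it, extent is a standard argument of matplotlib's imshow that places an image inside a given rectangle in data coordinates. The following is a minimal sketch of the idea, not the code in the linked commit: each image is drawn into its own unit-wide cell, so image i lines up with column i regardless of its pixel size. The random arrays, the cell layout, and the axis limits are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Stand-in images with different pixel sizes (height, width).
sizes = [(40, 60), (80, 50), (30, 30), (64, 64)]
images = [rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8) for h, w in sizes]

fig, ax = plt.subplots(figsize=(8, 2))
for i, img in enumerate(images):
    # extent = (left, right, bottom, top): image i fills the cell [i, i + 1] x [0, 1].
    ax.imshow(img, extent=(i, i + 1, 0, 1), aspect="auto")

# Each imshow call may reset the limits to its own extent, so set them explicitly.
ax.set_xlim(0, len(images))
ax.set_ylim(0, 1)
ax.set_axis_off()
```

Setting the limits after the loop matters here: otherwise the view can end up covering only the last image drawn.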

Example 1: make the dataset:

df = pd.DataFrame(['AAAA1'] * 5 + ['BBBBB2'] * 5, columns=['AB'])
df['CD'] = ['C'] * 3 + ['D'] * 3 + ['G'] * 4
df['F'] = np.random.normal(0, 1, 10)
df.index = ['sample' + str(i) for i in range(1, df.shape[0] + 1)]
df_box = pd.DataFrame(np.random.randn(10, 4), columns=['Gene' + str(i) for i in range(1, 5)])
df_box.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar = pd.DataFrame(np.random.uniform(0, 10, (10, 2)), columns=['TMB1', 'TMB2'])
df_bar.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_scatter = pd.DataFrame(np.random.uniform(0, 10, 10), columns=['Scatter'])
df_scatter.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar1 = pd.DataFrame(np.random.uniform(0, 10, (10, 2)), columns=['T1-A', 'T1-B'])
df_bar1.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar2 = pd.DataFrame(np.random.uniform(0, 10, (10, 2)), columns=['T2-A', 'T2-B'])
df_bar2.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar3 = pd.DataFrame(np.random.uniform(0, 10, (10, 2)), columns=['T3-A', 'T3-B'])
df_bar3.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar3.iloc[5,0]=np.nan
df_bar4 = pd.DataFrame(np.random.uniform(0, 10, (10, 1)), columns=['T4'])
df_bar4.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar4.iloc[7,0]=np.nan
df_img = pd.DataFrame([f"1 copy {i}.jpeg" for i in range(1,11)], columns=['path'])
df_img.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
print(df)
print(df_box.head())
print(df_scatter)
print(df_img)

Plot:

plt.figure(figsize=(20, 8))
col_ha = HeatmapAnnotation(
            label=anno_label(df.AB, merge=True,rotation=15),
            AB=anno_simple(df.AB,add_text=True,legend=True), axis=1,
            CD=anno_simple(df.CD, add_text=True,legend=True,text_kws={'color':'black'}),
            Exp=anno_boxplot(df_box, cmap='turbo',legend=True),
            Scatter=anno_scatterplot(df_scatter), 
            Bar1=anno_barplot(df_bar1,legend=True,cmap='Dark2'),
            Bar4=anno_barplot(df_bar4,legend=True,cmap='turbo'),
            Img=anno_img(df_img.path),
            plot=True,legend=True,legend_gap=5,hgap=0.5)
col_ha.show_ticklabels(df.index.tolist(),fontdict={'color':'blue'},rotation=-30)
plt.show()
[Image]

Example 2: a simple example:

plt.figure(figsize=(20,2))
ann=anno_img(df_img.path)
ann.plot()
[Image]

You can use these two examples to test the function during your development. Several things still need to be done, for example:
(1) Add some space at the boundary between two adjacent columns/rows?
(2) Test whether row_split_gap and col_split_gap display as expected when row_split or col_split is specified.
(3) Add plot_kws, such as cmap, norm, vmax, and vmin, that will be passed to ax.imshow.
(4) Other useful parameters or functions.

Could you please keep working on this and implement the features above?
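Items (1) and (3) could be approached roughly as follows. This is only a sketch under stated assumptions: cell_extents and draw_images are hypothetical helpers, not functions in PyComplexHeatmap, and the gap is expressed as a fraction of one cell width. Note that cmap, vmin, and vmax only affect 2D (single-channel) images in imshow; RGB arrays ignore them.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def cell_extents(n, gap=0.1):
    """Return (left, right, bottom, top) for n unit-wide cells,
    each shrunk by gap / 2 on its left and right side."""
    half = gap / 2
    return [(i + half, i + 1 - half, 0, 1) for i in range(n)]


def draw_images(ax, images, gap=0.1, **imshow_kws):
    """Hypothetical helper: draw images side by side with a gap between
    neighbours and forward any extra keywords to ax.imshow."""
    imshow_kws.setdefault("aspect", "auto")
    for img, extent in zip(images, cell_extents(len(images), gap)):
        ax.imshow(img, extent=extent, **imshow_kws)
    ax.set_xlim(0, len(images))
    ax.set_ylim(0, 1)
    ax.set_axis_off()


# Grayscale stand-ins so that cmap, vmin and vmax have a visible effect.
images = [np.arange(12, dtype=float).reshape(3, 4) * k for k in (1, 2, 3)]
fig, ax = plt.subplots(figsize=(6, 2))
draw_images(ax, images, gap=0.2, cmap="gray", vmin=0, vmax=33)
```

Sharing vmin and vmax across all images keeps their gray levels comparable; leaving them out would let each image be normalized on its own.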

dakomura commented 4 months ago

Hi @DingWB ,

Thank you so much for your help!

I'm happy to continue working on the implementation.

DingWB commented 4 months ago

Thanks. You can fork this repository first, and make a pull request after implementing this function.