higlass / higlass-python

Python bindings to and Jupyter Notebook+Lab integration for the HiGlass viewer
http://docs-python.higlass.io/
MIT License

Difficulty Visualizing Pandas DataFrame with HiGlass Python API in Jupyter Notebook #148

Open michaelfekadu opened 5 months ago

michaelfekadu commented 5 months ago

I've been struggling to find a clear explanation on how to visualize a Pandas DataFrame using the HiGlass Python API within a Jupyter notebook. It seems like I need to define a LocalTileset first, but I haven't had success with that approach. Additionally, the 'Numpy Matrix' section in the 'Getting Started' guide didn't yield results for me either. Can someone please provide guidance on how to visualize a Pandas DataFrame with the HiGlass Python API in a Jupyter notebook? Any help would be greatly appreciated! Thank you!

manzt commented 4 months ago

Can someone please provide guidance on how to visualize a Pandas DataFrame with the HiGlass Python API in a Jupyter notebook?

There is no official support for loading an in-memory data frame. It should be possible via a Tileset, I believe by sending bedlike tiles. Can you share more about the data you are trying to load?

Additionally, the 'Numpy Matrix' section in the 'Getting Started' guide didn't yield results for me either.

Fixed in #150

michaelfekadu commented 4 months ago

Thank you so much for your reply @manzt! I am trying to visualize the following in-memory data frame:

             T4d_311.0  T5d_311.0  T5a_311.0  T4b_311.0  T4a_311.0  T5c_311.0  
T4d_311.0    0.000000   0.332576   0.789181   0.888197   0.809307   0.726139   
T5d_311.0    0.332576   0.000000   0.597985   1.000000   0.818182   0.738883   
T5a_311.0    0.789181   0.597985   0.000000   0.882149   0.497481   0.711325   
T4b_311.0    0.888197   1.000000   0.882149   0.000000   0.680199   0.897938   
T4a_311.0    0.809307   0.818182   0.497481   0.680199   0.000000   0.912961   
T5c_311.0    0.726139   0.738883   0.711325   0.897938   0.912961   0.000000   

Would you mind providing a brief template code demonstrating how to achieve that using Tilesets and bedlike tiles? Thank you!

manzt commented 4 months ago

Is the goal to visualize a 2D matrix as a heatmap?

michaelfekadu commented 4 months ago

yes exactly!

manzt commented 4 months ago

There are several tools for visualizing such matrices interactively, beyond higlass-python.

First you need to extract the data as a numpy array:

import numpy as np
import pandas as pd
import io

txt = io.StringIO('''
T4d_311.0 T5d_311.0 T5a_311.0 T4b_311.0 T4a_311.0 T5c_311.0
T4d_311.0 0.000000 0.332576 0.789181 0.888197 0.809307 0.726139
T5d_311.0 0.332576 0.000000 0.597985 1.000000 0.818182 0.738883
T5a_311.0 0.789181 0.597985 0.000000 0.882149 0.497481 0.711325
T4b_311.0 0.888197 1.000000 0.882149 0.000000 0.680199 0.897938
T4a_311.0 0.809307 0.818182 0.497481 0.680199 0.000000 0.912961
T5c_311.0 0.726139 0.738883 0.711325 0.897938 0.912961 0.000000
''')

df = pd.read_csv(txt, sep=" ")
data = df.values # extract the numpy array
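
If the frame is pasted in its aligned, multi-space printed form (as in the question), a single-space separator will not split it cleanly. A minimal sketch, assuming a regex whitespace separator and using a 3x3 subset of the values above; the square/symmetric checks are an added sanity step, not something higlass-python requires:

```python
import io

import numpy as np
import pandas as pd

# Aligned, pandas-style printout (3x3 subset of the matrix in this thread).
printed = """
             T4d_311.0  T5d_311.0  T5a_311.0
T4d_311.0    0.000000   0.332576   0.789181
T5d_311.0    0.332576   0.000000   0.597985
T5a_311.0    0.789181   0.597985   0.000000
"""

# sep=r"\s+" splits on any run of whitespace. The header has one fewer
# field than the data rows, so pandas uses the first column as the index.
df = pd.read_csv(io.StringIO(printed), sep=r"\s+")

# Force a float array so the heatmap receives numeric values only.
data = df.to_numpy(dtype=float)

# Sanity checks for a pairwise-distance style matrix.
is_square = data.shape[0] == data.shape[1]
labels_match = list(df.index) == list(df.columns)
is_symmetric = bool(np.allclose(data, data.T))
```

If the DataFrame already exists in memory, only the `to_numpy` call and the checks are needed.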

Visualizing with higlass-python:

import higlass as hg
from clodius.tiles import npmatrix
from higlass.tilesets import LocalTileset

ts = hg.server.add(
    LocalTileset(
        info=lambda: npmatrix.tileset_info(data),
        tiles=lambda tids: npmatrix.tiles_wrapper(data, tids),
        uid="example-npmatrix",
        datatype="matrix"
    )
)

hg.view(
    hg.track("top-axis"),
    hg.track("left-axis"),
    ts.track("heatmap", height=250).opts(valueScaleMax=0.5),
)
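
The `valueScaleMax=0.5` above is below the largest value in the sample matrix (1.0), so the upper half of the range presumably ends up at the top of the color scale. A minimal sketch of deriving the bound from the data instead, assuming NaNs should be ignored; the variable names are illustrative:

```python
import numpy as np

# 3x3 subset of the matrix from this thread.
data = np.array(
    [
        [0.000000, 0.332576, 0.789181],
        [0.332576, 0.000000, 0.597985],
        [0.789181, 0.597985, 0.000000],
    ]
)

# nanmin/nanmax skip missing entries, which can occur in distance matrices.
value_min = float(np.nanmin(data))
value_max = float(np.nanmax(data))
```

`value_max` could then be passed as `valueScaleMax=value_max` in the `.opts(...)` call.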

Visualizing with vizarr:

import zarr
import vizarr

viewer = vizarr.Viewer()
viewer.add_image(zarr.array(data))
viewer

Visualizing with napari:

import napari

viewer = napari.view_image(data)
napari.run()