MaayanLab / clustergrammer-py

A python module for the clustergrammer matrix visualization project that creates the JSON object for the front-end JavaScript portion of Clustergrammer.
MIT License

Recursion error in dendrogram #10

Closed CGUTA closed 5 years ago

CGUTA commented 5 years ago

Hello, I found out about Clustergrammer through JupyterCon and I think it is very cool.

I tried making the cluster from a dataset (17K genes, 10K samples), but during the dendrogram construction an error occurs:

...scipy/cluster/hierarchy.py", line 2534, in _append_singleton_leaf_node
RecursionError: maximum recursion depth exceeded while getting the str of an object

Is there a way to deactivate the dendrogram to bypass this?

cornhundred commented 5 years ago

We're glad you like Clustergrammer. For a dataset of that size, I would recommend trying the new in-development Clustergrammer-GL front-end library and/or the Clustergrammer2 Jupyter widget. Both are built using WebGL, which allows visualization of larger datasets. However, I would also recommend reducing the dimensionality (try reducing to ~1,000) and randomly subsampling or downsampling your samples (try reducing to 1,000 - 5,000). This will probably fix the maximum recursion depth exceeded problem, and you will be able to visualize the reduced-size dataset.

You can try out Clustergrammer2 on MyBinder on the CCLE dataset (18,000 genes and 1,000 cell-lines) to get an idea of how to visualize a similar dataset.

Also, see clustergrammer2-examples for more examples.
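
The reduction suggested above could be sketched roughly as follows. This is only an illustration, not part of clustergrammer-py: the helper name `reduce_matrix`, the dict-of-lists input layout, and the choice of "keep the highest-variance rows" as the dimensionality reduction are all assumptions made for the example.

```python
import random
import statistics


def reduce_matrix(rows, col_names, max_rows=1000, max_cols=1000, seed=0):
    """Hypothetical helper: shrink a matrix before clustering.

    rows      -- dict mapping row name (e.g. gene) to a list of values,
                 one value per column
    col_names -- list of column names (e.g. samples)
    """
    rng = random.Random(seed)
    n_cols = len(col_names)

    # Randomly subsample the columns, keeping their original relative order.
    if n_cols > max_cols:
        keep = sorted(rng.sample(range(n_cols), max_cols))
    else:
        keep = list(range(n_cols))

    # Assumption: rank rows by variance across the kept columns and keep
    # the most variable ones. Other feature-selection rules would also work.
    def variance(name):
        return statistics.pvariance([rows[name][j] for j in keep])

    top = sorted(rows, key=variance, reverse=True)[:max_rows]

    reduced = {name: [rows[name][j] for j in keep] for name in top}
    return reduced, [col_names[j] for j in keep]


if __name__ == "__main__":
    demo_rows = {"a": [1, 1, 1, 1], "b": [0, 10, 0, 10], "c": [0, 1, 0, 1]}
    demo_cols = ["s1", "s2", "s3", "s4"]
    small, cols = reduce_matrix(demo_rows, demo_cols, max_rows=2, max_cols=3)
    print(list(small), cols)
```

The reduced matrix can then be passed to the clustering step in place of the full one; fixing `seed` keeps the random subsample reproducible between runs.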

CGUTA commented 5 years ago

Thank you for your response,

I tried generating a JSON from a 10X10 mtx using clustergrammer-py and then plugging it into the index.html of the clustergrammer-gl repo, but nothing renders (is the JSON from clustergrammer the same format as for clustergrammer-gl?). I checked the console and I get:

clustergrammer-gl.js:39333 Uncaught TypeError: tmp_cat.indexOf is not a function
    at clustergrammer-gl.js:39333
    at Function.Clustergrammer2../node_modules/underscore/underscore.js._.each._.forEach (clustergrammer-gl.js:35794)
    at generate_cat_array (clustergrammer-gl.js:39323)
    at generate_cat_params (clustergrammer-gl.js:44770)
    at initialize_params (clustergrammer-gl.js:45146)
    at run_viz (clustergrammer-gl.js:42112)
    at clustergrammer_gl (clustergrammer-gl.js:43142)
    at (index):72
    at d3.js:2010
    at Object.<anonymous> (d3.js:1995)

The mtx that I used to generate the JSON is:

        S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12
        3   3   3   3   3   3   4   3   3   3   3   3
Z   3   0   20  20  20  40  60  60  60  100 120 120 120
A   1   20  0   20  20  60  80  80  80  120 140 140 140
B   1   20  20  0   20  60  80  80  80  120 140 140 140
C   1   20  20  20  0   60  80  80  80  120 140 140 140
D   2   40  60  60  60  0   20  20  20  60  80  80  80
E   2   60  80  80  80  20  0   20  20  40  60  60  60
F   2   60  80  80  80  20  20  0   20  60  80  80  80
G   2   60  80  80  80  20  20  20  0   60  80  80  80
H   1   100 120 120 120 60  40  60  60  0   20  20  20
I   1   120 140 140 140 80  60  80  80  20  0   20  20
J   1   120 140 140 140 80  60  80  80  20  20  0   20
K   1   120 140 140 140 80  60  80  80  20  20  20  0

CGUTA commented 5 years ago

It worked! I have now generated the JSON directly for Clustergrammer-GL. Here is what I have learned from the current build so far.

The earlier error was probably because the JSON from clustergrammer-py only works when it generates categories, together with the dendrogram, that are compatible with your current example mult_view.json in data.

cornhundred commented 5 years ago

Great. The cat_index is the order of the rows/columns when ordering by category (e.g. when double-clicking on the category title). For Clustergrammer-GL this currently works only for column reordering, but I'll soon update it to support row category reordering as well.
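
As a rough illustration of the idea of a category ordering described above, the sketch below computes, for each column, its position once the columns are sorted by category. The function name `category_order` is hypothetical, and whether Clustergrammer-GL expects exactly this rank convention in its JSON is an assumption, not something confirmed in this thread.

```python
def category_order(names, categories):
    """Hypothetical helper: position of each item when sorted by category.

    names      -- list of row or column names
    categories -- list of category labels, one per name
    """
    # Stable sort of item indices by category label; items that share a
    # category keep their original relative order.
    order = sorted(range(len(names)), key=lambda i: categories[i])

    # rank[i] is the position of item i in the category-sorted ordering.
    rank = [0] * len(names)
    for position, i in enumerate(order):
        rank[i] = position
    return rank


if __name__ == "__main__":
    # Column names and categories taken from the matrix posted earlier.
    cols = ["S%d" % k for k in range(1, 13)]
    cats = [3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3]
    print(dict(zip(cols, category_order(cols, cats))))
```

With the posted matrix, S7 is the only column in category 4, so it moves to the last position while the other columns keep their relative order.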