pinellolab / STREAM

STREAM: Single-cell Trajectories Reconstruction, Exploration And Mapping of single-cell data
http://stream.pinellolab.org
GNU Affero General Public License v3.0

When visualizing the Seurat Clusters I am not getting the correct output even though I have converted the seurat clusters pandas column to string. #82

Closed smk5g5 closed 4 years ago

smk5g5 commented 4 years ago

Hi,

As per our conversation yesterday, I was able to use the multi-label list as described in https://github.com/pinellolab/STREAM/issues/80. The Seurat clusters look as shown in the figure below.

[image: Screen Shot 2020-06-30 at 9 33 16 AM]

At first, STREAM interpreted the Seurat_clusters column as integer and plotted a continuous distribution over the clusters:

[image]

So I changed the column to str, and now it gives me a plot that does not show all of the cluster information:

[image]

Is there a workaround to get this right?

Thanks!

huidongchen commented 4 years ago

This is a notorious issue with the seaborn package, which STREAM uses internally. Somehow it always interprets integer-like labels as a numerical type, regardless of their actual data type.

An easy workaround is to add a prefix to your cluster labels, e.g. adata.obs['Seruat_clusters'] = 'cluster_'+adata.obs['Seruat_clusters']

smk5g5 commented 4 years ago

[image: test_clust]

Is a more divergent color scheme possible here?

huidongchen commented 4 years ago

This is not what it's supposed to look like. Can you try adding a prefix to your cluster labels, as I suggested above?

smk5g5 commented 4 years ago

That throws an error. adata.obs['Seruat_clusters'] = 'cluster_'+adata.obs['Seruat_clusters']

KeyError                                  Traceback (most recent call last)
~/opt/anaconda3/envs/env_stream/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Seruat_clusters'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
 in
----> 1 adata.obs['Seruat_clusters'] = 'cluster_'+adata.obs['Seruat_clusters']

~/opt/anaconda3/envs/env_stream/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2798             if self.columns.nlevels > 1:
   2799                 return self._getitem_multilevel(key)
-> 2800             indexer = self.columns.get_loc(key)
   2801             if is_integer(indexer):
   2802                 indexer = [indexer]

~/opt/anaconda3/envs/env_stream/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Seruat_clusters'

huidongchen commented 4 years ago

Sorry, I misspelled it. adata.obs['Seurat_clusters'] = 'cluster_'+adata.obs['Seurat_clusters'] should work. (Please make sure the data type of 'Seurat_clusters' is str first.)
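
For readers following along, here is a minimal sketch of the workaround above. It uses a plain pandas DataFrame named obs as a hypothetical stand-in for adata.obs (an assumption made so the snippet runs without STREAM or AnnData installed), and the cluster values are made up for illustration. It checks that the column name exists, casts the column to str, and then adds the prefix.

```python
import pandas as pd

# Hypothetical stand-in for adata.obs; the values are illustrative only.
obs = pd.DataFrame(
    {"Seurat_clusters": [0, 1, 2, 1, 0]},
    index=[f"cell_{i}" for i in range(5)],
)

column = "Seurat_clusters"

# A misspelled column name is what caused the KeyError earlier in this thread,
# so check the name before indexing.
if column not in obs.columns:
    raise KeyError(f"{column!r} not found; available columns: {list(obs.columns)}")

# Cast to str first (this also handles integer or categorical columns),
# then prepend a prefix so the labels no longer look like numbers.
obs[column] = "cluster_" + obs[column].astype(str)

labels = obs[column].unique().tolist()
print(labels)  # ['cluster_0', 'cluster_1', 'cluster_2']
```

With a real AnnData object, the same two steps would presumably be applied to adata.obs directly, as in the one-liner above.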

smk5g5 commented 4 years ago

Thanks! That worked great!

[image: test_clust]