aertslab / scenicplus

SCENIC+ is a Python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.

[BUG] Error in plotting heatmap #233

Open josephineyates opened 9 months ago

josephineyates commented 9 months ago

Dear all, thank you for this awesome package! I found a potential minor bug in the code. In the heatmap_dotplot function, when I change the sort_by argument I get this error:

KeyError                                  Traceback (most recent call last)
File ~/mambaforge/envs/scenicenv/lib/python3.10/site-packages/pandas/core/indexes/base.py:3800, in Index.get_loc(self, key, method, tolerance)
   3799 try:
-> 3800     return self._engine.get_loc(casted_key)
   3801 except KeyError as err:

File ~/mambaforge/envs/scenicenv/lib/python3.10/site-packages/pandas/_libs/index.pyx:138, in pandas._libs.index.IndexEngine.get_loc()

File ~/mambaforge/envs/scenicenv/lib/python3.10/site-packages/pandas/_libs/index.pyx:165, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:5745, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:5753, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'color_val'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[100], line 2
      1 from scenicplus.plotting.dotplot import *
----> 2 heatmap_dotplot(
      3         scplus_obj = scplus_obj,
      4         size_matrix = scplus_obj.uns['RSS']['ACC_highlevel_wcancer_gene_based'],
      5         color_matrix = scplus_obj.to_df('EXP'),
      6         scale_size_matrix = True,
      7         scale_color_matrix = True,
      8         group_variable = 'ACC_highlevel_wcancer',
      9         subset_eRegulons = scplus_obj.uns['selected_eRegulons']['Gene_based'],
     10         figsize = (15, 10),
     11         orientation = 'vertical',
     12         split_repressor_activator=False, 
     13     sort_by="size_val")

File /cluster/work/boeva/jyates/EAC_singlecell/scenicplus/src/scenicplus/plotting/dotplot.py:198, in heatmap_dotplot(scplus_obj, size_matrix, color_matrix, scale_size_matrix, scale_color_matrix, group_variable, subset_eRegulons, sort_by, index_order, save, figsize, split_repressor_activator, orientation)
    196         plotting_df['index'] = pd.Categorical(plotting_df['index'], categories = index_order)
    197 #sort values
--> 198 tmp = plotting_df[['index', 'eRegulon_name', sort_by]
    199     ].pivot_table(index = 'index', columns = 'eRegulon_name'
    200     ).fillna(0)['color_val']
    201 if index_order is not None:
    202     tmp = tmp.loc[index_order]

File ~/mambaforge/envs/scenicenv/lib/python3.10/site-packages/pandas/core/frame.py:3804, in DataFrame.__getitem__(self, key)
   3802 if is_single_key:
   3803     if self.columns.nlevels > 1:
-> 3804         return self._getitem_multilevel(key)
   3805     indexer = self.columns.get_loc(key)
   3806     if is_integer(indexer):

File ~/mambaforge/envs/scenicenv/lib/python3.10/site-packages/pandas/core/frame.py:3855, in DataFrame._getitem_multilevel(self, key)
   3853 def _getitem_multilevel(self, key):
   3854     # self.columns is a MultiIndex
-> 3855     loc = self.columns.get_loc(key)
   3856     if isinstance(loc, (slice, np.ndarray)):
   3857         new_columns = self.columns[loc]

File ~/mambaforge/envs/scenicenv/lib/python3.10/site-packages/pandas/core/indexes/multi.py:2915, in MultiIndex.get_loc(self, key, method)
   2912     return mask
   2914 if not isinstance(key, tuple):
-> 2915     loc = self._get_level_indexer(key, level=0)
   2916     return _maybe_to_slice(loc)
   2918 keylen = len(key)

File ~/mambaforge/envs/scenicenv/lib/python3.10/site-packages/pandas/core/indexes/multi.py:3262, in MultiIndex._get_level_indexer(self, key, level, indexer)
   3258         return slice(i, j, step)
   3260 else:
-> 3262     idx = self._get_loc_single_level_index(level_index, key)
   3264     if level > 0 or self._lexsort_depth == 0:
   3265         # Desired level is not sorted
   3266         if isinstance(idx, slice):
   3267             # test_get_loc_partial_timestamp_multiindex

File ~/mambaforge/envs/scenicenv/lib/python3.10/site-packages/pandas/core/indexes/multi.py:2848, in MultiIndex._get_loc_single_level_index(self, level_index, key)
   2846     return -1
   2847 else:
-> 2848     return level_index.get_loc(key)

File ~/mambaforge/envs/scenicenv/lib/python3.10/site-packages/pandas/core/indexes/base.py:3802, in Index.get_loc(self, key, method, tolerance)
   3800     return self._engine.get_loc(casted_key)
   3801 except KeyError as err:
-> 3802     raise KeyError(key) from err
   3803 except TypeError:
   3804     # If we have a listlike key, _check_indexing_error will raise
   3805     #  InvalidIndexError. Otherwise we fall through and re-raise
   3806     #  the TypeError.
   3807     self._check_indexing_error(key)

KeyError: 'color_val'

Indeed, in the source code, this is where it throws the error:

tmp = plotting_df[['index', 'eRegulon_name', sort_by]
        ].pivot_table(index = 'index', columns = 'eRegulon_name'
        ).fillna(0)['color_val']

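As far as I can tell, "color_val" is hardcoded in the final column selection. Because only the sort_by column is passed into the pivot table, "color_val" does not exist in the result unless sort_by is "color_val". If I'm not mistaken, the selection should use sort_by as well.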

The function runs without issue if the sort_by argument is set to "color_val".
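To illustrate, here is a minimal sketch of the suggested change on a toy DataFrame. The data values and the helper name pivot_for_sorting are made up for this example and are not part of scenicplus; only the column names and the pivot expression come from the snippet above. This is a guess at the fix, not the maintainers' actual patch.

```python
import pandas as pd

# Toy stand-in for plotting_df; the values are invented for illustration.
plotting_df = pd.DataFrame({
    "index": ["A", "A", "B", "B"],
    "eRegulon_name": ["TF1_+", "TF2_+", "TF1_+", "TF2_+"],
    "size_val": [0.1, 0.9, 0.8, 0.2],
    "color_val": [1.0, -1.0, -0.5, 0.5],
})


def pivot_for_sorting(df, sort_by):
    """Hypothetical helper mirroring the pivot in heatmap_dotplot.

    Only the sort_by column is carried into the pivot table, so the
    top level of the resulting column MultiIndex contains sort_by alone.
    Selecting with sort_by (instead of a hardcoded 'color_val') therefore
    works for either choice.
    """
    pivoted = df[["index", "eRegulon_name", sort_by]].pivot_table(
        index="index", columns="eRegulon_name"
    ).fillna(0)
    return pivoted[sort_by]


# Works with the proposed change, whichever column is used for sorting.
tmp = pivot_for_sorting(plotting_df, "size_val")
print(tmp)
```

With the hardcoded ['color_val'] selection, the same call with sort_by="size_val" raises KeyError: 'color_val', which matches the traceback above.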

Thank you again for your great work!

SeppeDeWinter commented 9 months ago

Hi @josephineyates

Thank you for the bug report. This is indeed a mistake in the code.

I'll try to fix it soon.

Best,

Seppe