aertslab / pycisTopic

pycisTopic is a Python module to simultaneously identify cell states and cis-regulatory topics from single cell epigenomics data.
Other
58 stars 12 forks source link

cell_topic_heatmap color_dictionary can't be None #161

Closed camiel-m closed 2 months ago

camiel-m commented 2 months ago

129 I ran into this same issue and added some code to check if the color_dict exists and if not to generate a simple one using matplotlib.colormaps

SeppeDeWinter commented 2 months ago

Thanks for the PR!

Looking again at the code I noticed that the case where variables is None is also not handled correctly (hence the extra changes).

When a variable is not present in the color_dictionary a random color was already assigned. If you prefer this to be colors from tab20, or another color map from matplotlib, feel free to change this.

@camiel-m could you check wether the code is working?

Thanks,

Seppe

camiel-m commented 2 months ago

color_dict = None is now working fine, but if you don't specify any variables you get the following error:

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)
Cell In[13], line 2
      1 key = 'pycisTopic_leiden_10_0.3'
----> 2 cell_topic_heatmap(
      3     cistopic_obj,
      4     scale = False,
      5     legend_loc_x = 1.0,
      6     legend_loc_y = -1.2,
      7     legend_dist_y = -1,
      8     figsize = (10, 10)
      9 )

File #/git_repos/pycisTopic/src/pycisTopic/clust_vis.py:933, in cell_topic_heatmap(cistopic_obj, variables, remove_nan, scale, cluster_topics, color_dictionary, seed, legend_loc_x, legend_loc_y, legend_dist_y, figsize, selected_topics, selected_cells, harmony, save)
    926 if scale:
    927     cell_topic = pd.DataFrame(
    928         sklearn.preprocessing.StandardScaler().fit_transform(cell_topic),
    929         index=cell_topic.index.to_list(),
    930         columns=cell_topic.columns,
    931     )
--> 933 if remove_nan and (sum(cell_data[variables].isnull().sum()) > 0):
    934     cell_data = cell_data[variables].dropna()
    935     cell_topic = cell_topic.loc[:, cell_data.index.tolist()]

maybe a simple change like this would be enough?

    if variables:
        if remove_nan and (sum(cell_data[variables].isnull().sum()) > 0):
            cell_data = cell_data[variables].dropna()
            cell_topic = cell_topic.loc[:, cell_data.index.tolist()]
SeppeDeWinter commented 2 months ago

Ah yes, right!

Should be fixed now (I hope).

camiel-m commented 2 months ago

It doesn't throw an error, but the plot does look quite different from default behavior. The axis are flipped and the cells are not sorted. Is this the intended behavior or should the cells maybe sorted based on similarity or by highest topic score? Then again I don't think a lot of people will use this without cluster labels. download

SeppeDeWinter commented 2 months ago

hmm yes. does not look super informative.

How about when you set cluster_topics to True (it's False by default)

camiel-m commented 2 months ago

It's a lot more interpretable that way. download

SeppeDeWinter commented 2 months ago

Great!!