aertslab / scenicplus

SCENIC+ is a python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.
Other
162 stars 27 forks source link

Problem with GRN plot #393

Open SteveTur opened 1 month ago

SteveTur commented 1 month ago

Hello,

It seems that the function to plot the GRN on python doesn't work anymore since the recent updates:

from scenicplus.networks import create_nx_tables, create_nx_graph, plot_networkx, export_to_cytoscape G, pos, edge_tables, node_tables = create_nx_graph( nx_tables, use_edge_tables = ['TF2R','R2G'], color_edge_by = { 'TF2R': {'variable' : 'TF', 'category_color' : category_color}, 'R2G': {'variable' : 'R2G_rho', 'continuous_color' : 'viridis', 'v_min': -1, 'v_max': 1} }, transparency_edge_by = { 'R2G': {'variable' : 'R2G_importance', 'min_alpha': 0.1, 'v_min': 0} }, width_edge_by = { 'R2G': {'variable' : 'R2G_importance', 'max_size' : 1.5, 'min_size' : 1} }, color_node_by = { 'TF': {'variable': 'TF', 'category_color' : category_color}, 'Gene': {'variable': 'GEX_celltype_Log2FC_6', 'continuous_color' : 'bwr'}, 'Region': {'variable': 'GEX_celltype_Log2FC_6', 'continuous_color' : 'viridis'} }, transparency_node_by = { 'Region': {'variable' : 'GEX_celltype_Log2FC_6', 'min_alpha': 0.1}, 'Gene': {'variable' : 'GEX_celltype_Log2FC_6', 'min_alpha': 0.1} }, size_node_by = { 'TF': {'variable': 'fixed_size', 'fixed_size': 30}, 'Gene': {'variable': 'fixed_size', 'fixed_size': 15}, 'Region': {'variable': 'fixed_size', 'fixed_size': 10} }, shape_node_by = { 'TF': {'variable': 'fixed_shape', 'fixed_shape': 'ellipse'}, 'Gene': {'variable': 'fixed_shape', 'fixed_shape': 'ellipse'}, 'Region': {'variable': 'fixed_shape', 'fixed_shape': 'diamond'} }, label_size_by = { 'TF': {'variable': 'fixed_label_size', 'fixed_label_size': 10.0}, 'Gene': {'variable': 'fixed_label_size', 'fixed_label_size': 10.0}, 'Region': {'variable': 'fixed_label_size', 'fixed_label_size': 0.0} }, layout='kamada_kawai_layout', scale_position_by=250 )

plt.figure(figsize=(50,50)) plot_networkx(G, pos)

Only blank output are coming out for some reason. We never got any problem before the recent changes. What should we do?

Thank you for your help,

Best,

Steven

my0916 commented 1 month ago

Hi,

I'm also suffering from a similar problem... https://github.com/aertslab/scenicplus/discussions/378#discussioncomment-9314238

I also tried to export to Cytoscape with the export_to_cytoscape function, but the exported cys file cannot be loaded with the message 'Cannot find the cysession.xml file'

Thank you for your help.

Best, Masa

yiyelinfeng commented 1 month ago

I have same problem, please help!

SeppeDeWinter commented 1 month ago

Hi @SteveTur, @my0916 and @yiyelinfeng

I am aware of the issue. I have added it to my to do list and will try to fix it whenever I have some time.

All the best,

Seppe

yiyelinfeng commented 1 month ago

Hi@ SeppeDeWinter,

my issue only just the .cys file cannot be loaded in cytoscape with the message 'Cannot find the cysession.xml file' when file -> import -> Network from file ... after import the SNENIC+ network layout. Because I want to check and change the network graph using Cytoscape. but can't load it.

thanks,

Best, Lin

DmitriiSeverinov commented 1 month ago

Hi @SeppeDeWinter , @yiyelinfeng @my0916 , @SteveTur ,

I had similar issue with blank output, what I found is that if I comment the following line in the create_nx_tables function, it works: subset_eRegulons = [x + '_[^a-zA-Z0-9]' for x in subset_eRegulons]

So, here is my code how I did it:

from pycisTopic.diff_features import find_highly_variable_features

scplus_obj.uns['direct_e_regulon_metadata_filtered'] = scplus_mdata.uns['direct_e_regulon_metadata_filtered']

hvr = find_highly_variable_features(scplus_obj.to_df('ACC').loc[list(set(scplus_obj.uns['direct_e_regulon_metadata_filtered']['Region']))], n_top_features=3000, plot = False)
hvg = find_highly_variable_features(scplus_obj.to_df('EXP')[list(set(scplus_obj.uns['direct_e_regulon_metadata_filtered']['Gene']))].T, n_top_features=3000, plot = False)

from scenicplus.networks import create_nx_tables, create_nx_graph, plot_networkx, export_to_cytoscape

nx_tables = list()

def _format_df_nx(df, key, var):
    """
    A helper function to format differential test results
    """
    df.index = df['names']
    df = pd.DataFrame(df['logfoldchanges'])
    df.columns = [var+'_Log2FC_'+key]
    df.index.name = None
    return df

def _get_log2fc_nx(scplus_obj: 'SCENICPLUS',
                  variable,
                  features,
                  contrast: Optional[str] = 'gene'
                  ):
    """
    A helper function to derive log2fc changes
    """
    if contrast == 'gene':
        adata = anndata.AnnData(X=scplus_obj.X_EXP, obs=pd.DataFrame(
            index=scplus_obj.cell_names), var=pd.DataFrame(index=scplus_obj.gene_names))
    if contrast == 'region':
        adata = anndata.AnnData(X=scplus_obj.X_ACC.T, obs=pd.DataFrame(
            index=scplus_obj.cell_names), var=pd.DataFrame(index=scplus_obj.region_names))
    adata.obs = pd.DataFrame(scplus_obj.metadata_cell[variable])
    sc.pp.normalize_total(adata, target_sum=1e4)
    sc.pp.log1p(adata)
    adata = adata[:, features]
    sc.tl.rank_genes_groups(
        adata, variable, method='wilcoxon', corr_method='bonferroni')
    groups = adata.uns['rank_genes_groups']['names'].dtype.names
    diff_list = [_format_df_nx(sc.get.rank_genes_groups_df(
        adata, group=group), group, variable) for group in groups]
    return pd.concat(diff_list, axis=1)

def create_nx_tables(scplus_obj: 'SCENICPLUS',
                     eRegulon_metadata_key: str ='eRegulon_metadata',
                     subset_eRegulons: List = None,
                     subset_regions: List = None,
                     subset_genes: List = None,
                     add_differential_gene_expression: bool = False,
                     add_differential_region_accessibility: bool = False,
                     differential_variable: List =[]):
    """
    A function to format eRegulon data into tables for plotting eGRNs.

    Parameters
    ---------
    scplus_obj: SCENICPLUS
        A SCENICPLUS object with eRegulons
    eRegulon_metadata_key: str, optional
        Key where the eRegulon metadata dataframe is stored
    subset_eRegulons: list, optional
        List of eRegulons to subset
    subset_regions: list, optional
        List of regions to subset
    subset_genes: list, optional
        List of genes to subset
    add_differential_gene_expression: bool, optional
        Whether to calculate differential gene expression logFC for a given variable
    add_differential_region_accessibility: bool, optional
        Whether to calculate differential region accessibility logFC for a given variable
    differential_variable: list, optional
        Variable to calculate differential gene expression or region accessibility.

    Return
    ---------
    A dictionary with edge feature tables ('TF2G', 'TF2R', 'R2G') and node feature tables ('TF', 'Gene', 'Region')
    """
    er_metadata = scplus_obj.uns[eRegulon_metadata_key].copy()
    if subset_eRegulons is not None:
        # subset_eRegulons = [x + '_[^a-zA-Z0-9]' for x in subset_eRegulons]
        er_metadata = er_metadata[er_metadata['Region_signature_name'].str.contains(
            '|'.join(subset_eRegulons))]
    if subset_regions is not None:
        er_metadata = er_metadata[er_metadata['Region'].isin(subset_regions)]
    if subset_genes is not None:
        er_metadata = er_metadata[er_metadata['Gene'].isin(subset_genes)]
    nx_tables = {}
    nx_tables['Edge'] = {}
    nx_tables['Node'] = {}
    # Generate edge tables
    r2g_columns = [x for x in er_metadata.columns if 'R2G' in x]
    tf2g_columns = [x for x in er_metadata.columns if 'TF2G' in x]
    nx_tables['Edge']['TF2R'] = er_metadata[er_metadata.columns.difference(
        r2g_columns + tf2g_columns)].drop('Gene', axis=1).drop_duplicates()
    nx_tables['Edge']['TF2R'] = nx_tables['Edge']['TF2R'][['TF', 'Region'] +
                                                          nx_tables['Edge']['TF2R'].columns.difference(['TF', 'Region']).tolist()]
    nx_tables['Edge']['R2G'] = er_metadata[er_metadata.columns.difference(
        tf2g_columns)].drop('TF', axis=1).drop_duplicates()
    nx_tables['Edge']['R2G'] = nx_tables['Edge']['R2G'][['Region', 'Gene'] +
                                                        nx_tables['Edge']['R2G'].columns.difference(['Region', 'Gene']).tolist()]
    nx_tables['Edge']['TF2G'] = er_metadata[er_metadata.columns.difference(
        r2g_columns)].drop('Region', axis=1).drop_duplicates()
    nx_tables['Edge']['TF2G'] = nx_tables['Edge']['TF2G'][['TF', 'Gene'] +
                                                          nx_tables['Edge']['TF2G'].columns.difference(['TF', 'Gene']).tolist()]
    # Generate node tables
    tfs = list(set(er_metadata['TF']))
    nx_tables['Node']['TF'] = pd.DataFrame(
        'TF', index=tfs, columns=['Node_type'])
    nx_tables['Node']['TF']['TF'] = tfs
    genes = list(set(er_metadata['Gene']))
    genes = [x for x in genes if x not in tfs]
    nx_tables['Node']['Gene'] = pd.DataFrame(
        'Gene', index=genes, columns=['Node_type'])
    nx_tables['Node']['Gene']['Gene'] = genes
    regions = list(set(er_metadata['Region']))
    nx_tables['Node']['Region'] = pd.DataFrame(
        'Region', index=regions, columns=['Node_type'])
    nx_tables['Node']['Region']['Region'] = regions
    # Add gene logFC
    if add_differential_gene_expression is True:
        for var in differential_variable:
            nx_tables['Node']['TF'] = pd.concat([nx_tables['Node']['TF'], _get_log2fc_nx(
                scplus_obj, var, nx_tables['Node']['TF'].index.tolist(), contrast='gene')], axis=1)
            nx_tables['Node']['Gene'] = pd.concat([nx_tables['Node']['Gene'], _get_log2fc_nx(
                scplus_obj, var, nx_tables['Node']['Gene'].index.tolist(), contrast='gene')], axis=1)
    if add_differential_region_accessibility is True:
        for var in differential_variable:
            nx_tables['Node']['Region'] = pd.concat([nx_tables['Node']['Region'], _get_log2fc_nx(
                scplus_obj, var, nx_tables['Node']['Region'].index.tolist(), contrast='region')], axis=1)
    return nx_tables

nx_table = create_nx_tables(
    scplus_obj = scplus_obj,
    eRegulon_metadata_key ='direct_e_regulon_metadata_filtered',
    subset_eRegulons = ['atf3', 'bach2b'],
    subset_regions = hvr,
    subset_genes = hvg,
    add_differential_gene_expression = True,
    add_differential_region_accessibility = True,
    differential_variable = ['ident']
)

G, pos, edge_tables, node_tables = create_nx_graph(nx_table, 
                   use_edge_tables = ['TF2R','R2G'],
                   color_edge_by = {'TF2R': {'variable' : 'TF', 'category_color' : {'atf3': 'Red', 'bach2b': 'Blue'}},
                                    'R2G': {'variable' : 'importance_x_rho', 'continuous_color' : 'viridis', 'v_min': -1, 'v_max': 1}},
                   transparency_edge_by =  {'R2G': {'variable' : 'importance_R2G', 'min_alpha': 0.1, 'v_min': 0}},
                   width_edge_by = {'R2G': {'variable' : 'importance_R2G', 'max_size' :  1.5, 'min_size' : 1}},
                   color_node_by = {'TF': {'variable': 'TF', 'category_color' : {'atf3': 'Red', 'bach2b': 'Blue'}},
                                    'Gene': {'variable': 'ident_Log2FC_RSNs', 'continuous_color' : 'bwr'},
                                    'Region': {'variable': 'ident_Log2FC_RSNs', 'continuous_color' : 'viridis'}},
                   transparency_node_by =  {'Region': {'variable' : 'ident_Log2FC_RSNs', 'min_alpha': 0.1},
                                    'Gene': {'variable' : 'ident_Log2FC_RSNs', 'min_alpha': 0.1}},
                   size_node_by = {'TF': {'variable': 'fixed_size', 'fixed_size': 30},
                                    'Gene': {'variable': 'fixed_size', 'fixed_size': 15},
                                    'Region': {'variable': 'fixed_size', 'fixed_size': 10}},
                   shape_node_by = {'TF': {'variable': 'fixed_shape', 'fixed_shape': 'ellipse'},
                                    'Gene': {'variable': 'fixed_shape', 'fixed_shape': 'ellipse'},
                                    'Region': {'variable': 'fixed_shape', 'fixed_shape': 'diamond'}},
                   label_size_by = {'TF': {'variable': 'fixed_label_size', 'fixed_label_size': 20.0},
                                    'Gene': {'variable': 'fixed_label_size', 'fixed_label_size': 10.0},
                                    'Region': {'variable': 'fixed_label_size', 'fixed_label_size': 0.0}},
                   layout='kamada_kawai_layout',
                   scale_position_by=250)

plt.figure(figsize=(20,20))
plot_networkx(G, pos)
plt.savefig('figures/TFs_target_genes/eRegulons.pdf')
export_to_cytoscape(G, pos, out_file = os.path.join('figures/network.cys'))

I hope it will work for you as well!

However, I didn't manage to solve the problem with import to cytoscape :(

Best, Dmitrii

yiyelinfeng commented 1 month ago

just change "export_to_cytoscape(G, pos, out_file = os.path.join('figures/network.cyjs'))", it works.