aertslab / scenicplus

SCENIC+ is a Python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.

Branch missing scenicplus.cistromes modules #418

Open AmosFong1 opened 2 weeks ago

AmosFong1 commented 2 weeks ago

I recently completed the updated scenicplus workflow (Snakemake version) and realized that the functions to calculate correlations between TF expression and region/gene AUC are missing. Is there still a function for this, similar to those previously imported with from scenicplus.cistromes import TF_cistrome_correlation, generate_pseudobulks? If not, can you tell me which statistical test to use?

SeppeDeWinter commented 2 weeks ago

Hi @AmosFong1

Yes, these functions still exist.

You can import them like this:

from scenicplus.regulon_qc.quality_metrics import generate_pseudobulks, calculate_correlation

generate_pseudobulks uses the following parameters:

This function will produce a pandas DataFrame.

calculate_correlation takes two pandas DataFrames as input. Note that the indexes of these DataFrames should match (for example, both should represent the same cells or pseudobulks). It then calculates the correlation across regulons (columns).
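
To illustrate the index requirement, here is a minimal sketch using plain pandas rather than calculate_correlation itself. The DataFrames, pseudobulk labels, and regulon names are made up for illustration, and pandas' corrwith stands in for the correlation step; this is not necessarily how SCENIC+ computes it internally.

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows are pseudobulks, columns are regulons.
rng = np.random.default_rng(0)
pseudobulks = [f"pseudobulk_{i}" for i in range(6)]
df_a = pd.DataFrame(rng.random((6, 2)), index=pseudobulks, columns=["TF1", "TF2"])

# B is a linear transform of A, with its rows stored in reverse order.
df_b = (2.0 * df_a + 1.0).iloc[::-1]

# Both frames must describe the same pseudobulks before correlating.
assert set(df_a.index) == set(df_b.index)

# Put the rows of B in the same order as A. corrwith also aligns on labels
# by itself; the explicit reorder just makes the requirement visible.
df_b_aligned = df_b.loc[df_a.index]

# One Pearson correlation per shared column (regulon).
correlations = df_a.corrwith(df_b_aligned)
print(correlations)
```

If the two indexes do not contain the same labels, the assertion fails early instead of silently correlating mismatched rows.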

I hope this helps?

Best,

Seppe

AmosFong1 commented 2 weeks ago

Hi @SeppeDeWinter ,

Thanks for the quick reply. Can you clarify what the parameter mapping_A_to_B in calculate_correlation is? I can't find it in the SCENIC+ Read the Docs documentation, and running calculate_correlation? only indicates that it is a dict.

Best, Amos

SeppeDeWinter commented 2 weeks ago

Hi @AmosFong1

Right! It's a dictionary that maps the features (columns) in the A dataframe to those in B. For example, when calculating the correlation between region and gene values the features might look like this:

Gene based: TF_+_+(xg)
Region based: TF_+_+(xr)

Then the mapping would be {"TF_+_+(xg)": "TF_+_+(xr)"}.
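
As a rough sketch, such a mapping could be built by matching the part of each name before the opening parenthesis. The helper names build_mapping and correlate_mapped, the column names, and the target counts (10g, 25r, and so on) are hypothetical, and the per-pair Pearson correlation below is plain pandas, not the actual implementation of calculate_correlation.

```python
import numpy as np
import pandas as pd

def build_mapping(columns_a, columns_b):
    """Map each name in A to the name in B that shares the prefix before '('."""
    prefix_to_b = {name.split("(")[0]: name for name in columns_b}
    mapping = {}
    for name in columns_a:
        prefix = name.split("(")[0]
        if prefix in prefix_to_b:  # skip features without a counterpart in B
            mapping[name] = prefix_to_b[prefix]
    return mapping

def correlate_mapped(df_a, df_b, mapping):
    """Pearson correlation for every mapped (A column, B column) pair."""
    return pd.Series({a: df_a[a].corr(df_b[b]) for a, b in mapping.items()})

# Hypothetical gene-based and region-based values for the same pseudobulks.
rng = np.random.default_rng(1)
pseudobulks = [f"pseudobulk_{i}" for i in range(6)]
gene_based = pd.DataFrame(
    rng.random((6, 2)), index=pseudobulks, columns=["TF1_+_+(10g)", "TF2_+_-(5g)"]
)
region_based = pd.DataFrame({
    "TF1_+_+(25r)": 3.0 * gene_based["TF1_+_+(10g)"],  # perfectly correlated
    "TF2_+_-(8r)": -1.0 * gene_based["TF2_+_-(5g)"],   # perfectly anti-correlated
})

mapping = build_mapping(gene_based.columns, region_based.columns)
result = correlate_mapped(gene_based, region_based, mapping)
print(mapping)
print(result)
```

Matching on the prefix avoids typing the dictionary by hand when the gene-based and region-based names differ only in the count inside the parentheses.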

All the best,

Seppe