yelabucsf / scrna-parameter-estimation

Direct estimation of mean and covariance from single cell RNA seq experiments
MIT License
76 stars 6 forks

Questions about the capture_rate parameter #24

Open karenlawwc opened 1 month ago

karenlawwc commented 1 month ago

Hi, thanks for developing this package! I am wondering how to define the capture_rate parameter when the adata comprises several different cellranger runs. How would you recommend finding and defining s, the sequencing saturation?

Thanks!

yu-tong-wang commented 3 weeks ago

I think you can find the sequencing saturation in the web_summary.html that cellranger writes for each run.
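If reading the HTML report by hand is inconvenient, a rough sketch like the one below may help. It assumes that each run's output folder also contains a metrics_summary.csv with a "Sequencing Saturation" column formatted as a percentage string; that file name, the column name, and the helper `read_saturation` are assumptions for illustration, and the sample row is made up, so please check them against your own cellranger output.

```python
import io

import pandas as pd


def read_saturation(csv_source) -> float:
    """Return the sequencing saturation as a fraction in [0, 1].

    Assumes a one-row CSV with a "Sequencing Saturation" column holding
    a value such as "45.3%" (an assumption about the file layout).
    """
    metrics = pd.read_csv(csv_source)
    raw = str(metrics.loc[0, "Sequencing Saturation"])
    return float(raw.strip().rstrip("%")) / 100.0


# Made-up sample content standing in for one run's metrics file.
sample = io.StringIO(
    "Estimated Number of Cells,Sequencing Saturation\n"
    "5000,45.3%\n"
)
saturation = read_saturation(sample)
print(saturation)
```

In practice you would pass the path to each run's file instead of the in-memory sample and collect one value per run.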

capture_rate can be set per cell, I think. We just need to skip the wrapper function and assign a cell-specific or batch-specific capture_rate to adata.obs based on the cellranger run covariate, then call the individual steps ourselves. For reference, this is the wrapper function (memento/wrappers.py):

import memento

def binary_test_1d(adata, capture_rate, treatment_col, num_cpus, num_boot=5000, verbose=1, replicates=[]):
    """
    Wrapper function for comparing the mean and variability for two groups of cells.
    """

    # Work on a copy so the caller's AnnData object is left untouched.
    adata = adata.copy()
    # A scalar capture_rate is broadcast to every cell here.
    adata.obs['capture_rate'] = capture_rate
    memento.setup_memento(adata, q_column='capture_rate')
    # Group cells by treatment and, if given, by replicate columns.
    memento.create_groups(adata, label_columns=[treatment_col]+replicates)
    memento.compute_1d_moments(adata, min_perc_group=.9)
    sample_meta = memento.get_groups(adata)[[treatment_col]]
    # Bootstrap-based hypothesis test on the 1D moments.
    memento.ht_1d_moments(
        adata,
        treatment=sample_meta,
        num_boot=num_boot,
        verbose=verbose,
        num_cpus=num_cpus)
    result_1d = memento.get_1d_ht_result(adata)
    return result_1d
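
To make the batch-specific idea concrete, here is a minimal sketch of building a per-cell capture_rate column from a run label. The helper `per_cell_capture_rate`, the column name `run`, and the numeric rates are all hypothetical placeholders; how to derive each run's rate from its sequencing saturation is left to you. Whether memento handles a capture_rate that varies across cells as intended is only my guess from the wrapper above, so treat this as a starting point rather than a confirmed recipe.

```python
import pandas as pd


def per_cell_capture_rate(obs: pd.DataFrame, run_col: str, rate_by_run: dict) -> pd.Series:
    """Map each cell's run label to the capture rate chosen for that run."""
    # Cast to object so categorical run labels map like plain labels.
    rates = obs[run_col].astype(object).map(rate_by_run)
    missing = obs.loc[rates.isna(), run_col].unique().tolist()
    if missing:
        raise KeyError(f"No capture rate given for runs: {missing}")
    return rates.astype(float)


# Toy stand-in for adata.obs; 'run' is a hypothetical column name.
obs = pd.DataFrame({"run": ["run1", "run1", "run2"]}, index=["c1", "c2", "c3"])
rate_by_run = {"run1": 0.07, "run2": 0.05}  # placeholder values, not recommendations
capture_rate = per_cell_capture_rate(obs, "run", rate_by_run)
print(capture_rate)

# With a real AnnData object, the steps from the wrapper would then be called
# directly (left as comments because they need memento and real data):
#   adata.obs['capture_rate'] = per_cell_capture_rate(adata.obs, 'run', rate_by_run)
#   memento.setup_memento(adata, q_column='capture_rate')
#   memento.create_groups(adata, label_columns=[treatment_col] + replicates)
#   ... followed by the remaining calls shown in binary_test_1d
```

Raising an error for unmapped runs is a deliberate choice here, so that a missing run does not silently leave NaN values in the column.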