phbradley / conga

Clonotype Neighbor Graph Analysis
MIT License

Question about the `_compute_graph_overlap_stats()` function #66

Open xiachenrui opened 7 months ago

xiachenrui commented 7 months ago

Hi all,

I found the following code in the `_compute_graph_overlap_stats()` function, but I cannot understand it. How do you calculate `log10_overlap_sdev`, `log10_expected_overlap`, and `total_log10_indegree_variance`, and what do they mean? I could not find more information in the manuscript. Any explanation would be useful.

    ## params for log10_s determined with
    ## statsmodels.formula.api.ols(f'log10_overlap_sdev ~
    ##     log10_expected_overlap + total_log10_indegree_variance,...)
    # Intercept                       -0.340085
    # log10_expected_overlap           0.691433
    # total_log10_indegree_variance    0.253497
    total_log10_indegree_variance = (
        np.log10(gex_indegree_bias_stats.variance)+
        np.log10(tcr_indegree_bias_stats.variance))
    log10_s_fitted = (0.691433 * np.log10(expected_overlap)
                      +0.253497 * total_log10_indegree_variance
                      -0.340085)
    s_fitted = 10**log10_s_fitted
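
To make the question concrete, here is a minimal sketch of what the quoted lines appear to compute, using only the three coefficients from the comment. Because the fit is linear in log10 space, it is algebraically the same as a power law in the original units: `s_fitted = 10**intercept * expected_overlap**a * (gex_variance * tcr_variance)**b`. The function names, the argument names `gex_variance` and `tcr_variance` (standing in for `gex_indegree_bias_stats.variance` and `tcr_indegree_bias_stats.variance`), and the input values are my own assumptions for illustration. This does not show how the regression data were generated, which is the part I am asking about.

```python
import math

# Coefficients copied from the comment in the quoted snippet.
INTERCEPT = -0.340085
COEF_EXPECTED_OVERLAP = 0.691433
COEF_INDEGREE_VARIANCE = 0.253497


def fitted_overlap_sdev(expected_overlap, gex_variance, tcr_variance):
    """Evaluate the log-linear fit the same way the quoted snippet does."""
    total_log10_indegree_variance = (
        math.log10(gex_variance) + math.log10(tcr_variance))
    log10_s_fitted = (COEF_EXPECTED_OVERLAP * math.log10(expected_overlap)
                      + COEF_INDEGREE_VARIANCE * total_log10_indegree_variance
                      + INTERCEPT)
    return 10 ** log10_s_fitted


def fitted_overlap_sdev_power_law(expected_overlap, gex_variance, tcr_variance):
    """Same quantity, rewritten as a power law in the original units."""
    return (10 ** INTERCEPT
            * expected_overlap ** COEF_EXPECTED_OVERLAP
            * (gex_variance * tcr_variance) ** COEF_INDEGREE_VARIANCE)


# Hypothetical inputs, chosen only to exercise the formula.
s_log = fitted_overlap_sdev(10.0, 2.0, 3.0)
s_pow = fitted_overlap_sdev_power_law(10.0, 2.0, 3.0)
print(s_log, s_pow)  # the two forms should agree up to rounding
```

If this reading is right, `s_fitted` grows sublinearly with `expected_overlap` (exponent about 0.69) and increases with the product of the two indegree variances (exponent about 0.25).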