theislab / scib

Benchmarking analysis of data integration tools
MIT License

Computing Metrics on Seurat Objects #411

Open VishD17 opened 2 weeks ago

VishD17 commented 2 weeks ago

Hello,

Thank you for this great tool! We had some questions about how the scib metrics are computed on Seurat objects.

1) We converted Seurat objects to h5ad using sceasy in Jupyter through an R environment. We then opened these h5ad files in a Python environment and ran the scib metrics wrapper function directly, without any pre-processing. That's when we noticed abnormally low PCR comparison scores, which we believe are a consequence of the PCA being recomputed instead of the stored 'X_pca' embedding being used. Our solution (shown below) was to set recompute_pca=False and pass embed="X_pca" for both pcr_before and pcr_after. Do you think we are taking the correct approach?

2) We also noticed that when scale=False, the score is calculated as pcr_after - pcr_before instead of pcr_before - pcr_after as in the scaled score. Is this a bug?

3) For which other metrics are PCs recomputed by default? Would this be an issue for other metrics, e.g. kBET, as well?

Thank you for the help!

import scib

# The wrapper behaviour we worked around, which would discard the stored embedding:
# if embed == "X_pca":
#     embed = None

# Individually calculating the PC regression comparison score
pcr_before = scib.metrics.pcr(
    adata_unint,
    covariate="species",
    embed="X_pca",
    recompute_pca=False,
    n_comps=50,
    verbose=False,
)

pcr_after = scib.metrics.pcr(
    exp_1a,
    covariate="species",
    embed="X_pca",
    recompute_pca=False,
    n_comps=50,
    verbose=False,
)

pcr_comp_score = (pcr_before - pcr_after) / pcr_before
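For reference, the two score conventions from question 2 can be sketched in plain Python. The pcr values below are made-up placeholders, not real scib output, and the function names are ours, not the library's:

```python
# Sketch of the two PCR comparison conventions discussed above.
# pcr_before / pcr_after stand in for the variance explained by the
# covariate before and after integration; the numbers are made up.

def pcr_comparison_scaled(pcr_before, pcr_after):
    # Scaled score: positive when integration reduces covariate variance,
    # normalised by the pre-integration value.
    return (pcr_before - pcr_after) / pcr_before

def pcr_comparison_unscaled(pcr_before, pcr_after):
    # The unscaled branch as we observed it: after minus before.
    return pcr_after - pcr_before

print(pcr_comparison_scaled(0.4, 0.1))    # ~0.75
print(pcr_comparison_unscaled(0.4, 0.1))  # ~-0.3
```

With these toy numbers the two conventions disagree in sign, which is why the scale=False behaviour surprised us.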