openproblems-bio / openproblems

Formalizing and benchmarking open problems in single-cell genomics
MIT License

Combat (scaled) returns NaN in PC Regression #683

Closed scottgigante-immunai closed 1 year ago

scottgigante-immunai commented 1 year ago

Unclear why. See raw results output at https://github.com/openproblems-bio/openproblems/actions/runs/3463236911. cc @LuckyMD

LuckyMD commented 1 year ago

Discussion: https://github.com/openproblems-bio/website/pull/45

danielStrobl commented 1 year ago

The PCR metric uses adata.X to compute the pre-integration value, but combat overwrites adata.X. The metric value for combat (scaled) then comes out as 0, so it tries to compute 0/0, which gives NaN. Combat should probably be in the _feature subtask and not the _embedding one.
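To make the 0/0 concrete, here is a minimal sketch. It assumes the scaled comparison has the form (before - after) / before, clipped at 0. The function name `scaled_pcr_score` is hypothetical and is not the code in this repository.

```python
import math


def scaled_pcr_score(pcr_before: float, pcr_after: float) -> float:
    """Hypothetical scaled PC-regression comparison (assumed form)."""
    try:
        score = (pcr_before - pcr_after) / pcr_before
    except ZeroDivisionError:
        # Plain Python floats raise on 0/0; NumPy floats would yield nan
        # (with a RuntimeWarning) instead. Return nan to mirror that.
        return math.nan
    # Assumed clipping: a negative score is reported as 0.
    return max(score, 0.0)


# Batch variance contribution halves after integration.
normal = scaled_pcr_score(0.5, 0.25)

# Both values are 0, as described above for combat (scaled).
degenerate = scaled_pcr_score(0.0, 0.0)
```

In the second call the denominator is 0, so the score is undefined, which matches the NaN in the results.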

danielStrobl commented 1 year ago

https://github.com/openproblems-bio/openproblems/pull/784

LuckyMD commented 1 year ago

It should be in both, no? PCA on the features then? I think the better solution would be to consistently distinguish where the corrected features and the original features are stored. Could you do that?

LuckyMD commented 1 year ago

So this is an issue for all feature outputs then, no? Also combat unscaled, and seurat for example?

danielStrobl commented 1 year ago

I think the PR I linked fixes that. It places the original feature matrix, not the corrected one, in adata_pre.X.
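A rough sketch of the idea of keeping the original matrix separate from the corrected one. `ToyAnnData` and `toy_correct_in_place` are hypothetical stand-ins, not AnnData, ComBat, or the code changed in the PR.

```python
import copy
from dataclasses import dataclass


@dataclass
class ToyAnnData:
    """Hypothetical stand-in for an AnnData object; it only holds X."""
    X: list


def toy_correct_in_place(adata: ToyAnnData) -> None:
    """Placeholder for a method that overwrites X (not real ComBat)."""
    adata.X = [[0.0 for _ in row] for row in adata.X]


adata = ToyAnnData(X=[[1.0, 2.0], [3.0, 4.0]])

# Snapshot the original features before the method overwrites X.
adata_pre = copy.deepcopy(adata)

toy_correct_in_place(adata)

# adata_pre.X still holds the original values; adata.X holds the corrected ones.
```

With the snapshot taken first, a metric can read the pre-integration values from adata_pre.X and the corrected values from adata.X.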

LuckyMD commented 1 year ago

Would that overwrite the combat results?

scottgigante-immunai commented 1 year ago

I don't think so -- the function edited in #784 is only for batch_integration_embed