theislab / scib

Benchmarking analysis of data integration tools
MIT License
294 stars 63 forks

Some questions related to metrics here #255

Closed: HelloWorldLTY closed this issue 3 years ago

HelloWorldLTY commented 3 years ago

I have a question about some of the metrics, for example iLISI, ASW and kBET. If a dataset contains batch-specific cell types (for example, a cell type that exists in only one batch), is it still reasonable to use the original metric or the implementation in scIB?

The authors of this paper (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9) suggest removing the batch-specific cells and keeping only the cells of cell types that are shared across batches. Is that reasonable? I am not quite sure. Thanks.

LuckyMD commented 3 years ago

Hi @ChineseBest,

We have adapted kBET and ASW to work per cell type, so this is not an issue for those metrics in scIB. I'm not sure what the original metrics would do, or how you intend to use them here. For iLISI it might make sense to ignore these cell types. However, it probably won't matter too much: batch-specific cell types affect every method in the same way, and the metric is applied to all outputs in the end. It might be that some methods get a slight iLISI advantage if they mix unrelated cells, but other metrics will pick this up and down-rank that output. This is the power of having so many metrics in the end.
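
If you do want to ignore batch-specific cell types before computing iLISI, here is a minimal sketch of one way to find them. It is not part of scIB: the function name `shared_cell_type_mask`, the `min_batches` parameter and the toy labels are all made up for illustration, and it only assumes you have one batch label and one cell type label per cell.

```python
from collections import defaultdict


def shared_cell_type_mask(batches, cell_types, min_batches=2):
    """Mark cells whose cell type occurs in at least `min_batches` batches.

    Hypothetical helper for illustration, not an scIB function.
    Returns a list of booleans, one per cell, in the input order.
    """
    batches = list(batches)
    cell_types = list(cell_types)
    if len(batches) != len(cell_types):
        raise ValueError("batches and cell_types must have the same length")

    # Collect the set of batches in which each cell type is observed.
    batches_per_type = defaultdict(set)
    for batch, cell_type in zip(batches, cell_types):
        batches_per_type[cell_type].add(batch)

    # Keep only cell types seen in enough batches (batch-specific ones drop out).
    shared = {ct for ct, seen in batches_per_type.items() if len(seen) >= min_batches}
    return [ct in shared for ct in cell_types]


# Toy labels (made up): "NK" only occurs in batch "b2".
batches = ["b1", "b1", "b2", "b2", "b2"]
cell_types = ["T", "B", "T", "B", "NK"]
mask = shared_cell_type_mask(batches, cell_types)
print(mask)  # [True, True, True, True, False]
```

With an AnnData object you could probably pass the relevant `adata.obs` columns (whatever your batch and cell type columns are called) and subset with `adata[np.array(mask)]` before running the metric. Whether filtering is the right choice is still a judgment call, as discussed above.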