Closed XuezhenChen closed 2 years ago
Hi @XuezhenChen,
Thanks for the kind words. There are a few metrics you can use without labels. We adapted the kBET metric to use labels, though, so that we can correct for differences in cell type composition. PCR_batch and graph iLISI are the batch removal metrics that don't require labels to run. On the bio conservation side, trajectory conservation, cell cycle conservation, and HVG conservation don't require cell type labels.
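To illustrate why a neighbourhood-based batch metric needs only batch assignments and no cell type labels, here is a minimal pure-Python sketch of an unweighted inverse Simpson index over each cell's nearest neighbours. It is a simplified stand-in and not scIB's graph iLISI implementation (LISI uses perplexity-based neighbour weights, and scIB rescales the score); the function name `knn_inverse_simpson` and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def knn_inverse_simpson(embedding, batches, k=3):
    """Mean inverse Simpson index of batch labels among each cell's k nearest neighbours.

    Hypothetical helper for illustration only: neighbours are unweighted,
    unlike the perplexity-weighted neighbourhoods used by LISI.
    """
    scores = []
    for i, point in enumerate(embedding):
        # Euclidean distance from this cell to every other cell.
        dists = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(embedding)
            if j != i
        )
        neighbour_batches = [batches[j] for _, j in dists[:k]]
        n = len(neighbour_batches)
        counts = Counter(neighbour_batches)
        # Simpson index: probability that two random neighbours share a batch.
        simpson = sum((c / n) ** 2 for c in counts.values())
        scores.append(1.0 / simpson)
    return sum(scores) / len(scores)


# Toy data: two batches that sit far apart (no mixing).
separated = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
separated_batches = ["A"] * 4 + ["B"] * 4

# Toy data: two batches interleaved along a line (good mixing).
mixed = [(float(x), 0.0) for x in range(8)]
mixed_batches = ["A", "B"] * 4

score_separated = knn_inverse_simpson(separated, separated_batches, k=3)
score_mixed = knn_inverse_simpson(mixed, mixed_batches, k=3)
print(score_separated, score_mixed)
```

A score of 1 means every neighbourhood contains a single batch; with two batches, values closer to 2 indicate better mixing. No cell type annotation enters the calculation at any point.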
Regarding replacing `label_key` with a clustering output: this is possible, but I would definitely verify the clusters. In the end, the level of cluster annotation is what you will evaluate recovery of. If you base your clustering on one integrated embedding/graph, then you will bias all your evaluations towards that embedding. A better approach might be to cluster per batch and then map clusters to one another by correlation or marker genes.
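The idea of clustering per batch and then mapping clusters by correlation could be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions: the per-batch cluster assignments are taken as given, the helper names (`pearson`, `centroids`, `map_clusters`) and the toy expression values are hypothetical, and the mapping is a simple best-match that is not guaranteed to be one-to-one.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def centroids(expression, clusters):
    """Mean expression profile per cluster within one batch."""
    grouped = {}
    for profile, cluster in zip(expression, clusters):
        grouped.setdefault(cluster, []).append(profile)
    return {c: [mean(col) for col in zip(*rows)] for c, rows in grouped.items()}


def map_clusters(centroids_a, centroids_b):
    """For each cluster in batch A, pick the most correlated cluster in batch B."""
    return {
        a: max(centroids_b, key=lambda b: pearson(profile_a, centroids_b[b]))
        for a, profile_a in centroids_a.items()
    }


# Toy data: rows are cells, columns are four genes; clusters were called per batch.
batch1_expr = [[5, 1, 0, 0], [6, 0, 1, 0], [0, 0, 4, 5], [1, 0, 5, 6]]
batch1_clusters = [0, 0, 1, 1]

# Batch 2 carries a constant offset (a crude batch effect) and its own cluster names.
batch2_expr = [[2, 2, 7, 8], [3, 2, 6, 7], [8, 3, 2, 2], [7, 2, 3, 2]]
batch2_clusters = ["x", "x", "y", "y"]

mapping = map_clusters(
    centroids(batch1_expr, batch1_clusters),
    centroids(batch2_expr, batch2_clusters),
)
print(mapping)
```

Because correlation ignores a constant shift, the offset in the second batch does not disturb the matching here. On real data the matched clusters would still need to be checked, for example against marker genes, before using them as labels.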
Good luck!
Hello @LuckyMD,
Thanks for the clarification on kBET. We'll proceed with the metrics that don't require cell labels to run as you mentioned. Thanks again for your help!
Dear authors,
Thanks for the great work! The scIB tool has been a great resource for our team. We're trying to evaluate batch removal effects on several integrated datasets. So far `scib-pipeline/scripts/metrics/metrics.py` has worked well. However, in `metrics.py`, `--label_key` is required. I'm wondering if it's reasonable to input data without annotated cell labels for metrics calculation? (We would like to evaluate batch effect correction before we move into manual annotation.) I've considered the following:
label_key
?label_key
only (e.g. `from scIB.metrics import kbet`) and do the calculation separately.

I'm new to this area, and it would be great if you could give me some advice. Thanks for your time!