saezlab / ccc_protocols

LIANA x Tensor-cell2cell Protocols
https://ccc-protocols.readthedocs.io
MIT License

Doublet vs Total counts filtering #3

Closed dbdimitrov closed 1 year ago

dbdimitrov commented 1 year ago

We currently filter cells by total counts (a very minor step, since the data is already largely preprocessed), but if we want to stick to best practices, it would be better to switch to per-sample doublet removal :)

This is also related to the Seurat -> SCE (SingleCellExperiment) issue, because doublet removal in Seurat is not great, while in SCE it is largely comparable to scanpy.

dbdimitrov commented 1 year ago

OK. I ran two different doublet-filtering techniques in R, and both detected a different number of doublets than scrublet. We could perhaps leave the doublet detection code in as a best-practice reference and filter on total_counts only, to keep the R and Python versions consistent.
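For reference, here is a rough way to quantify how much two doublet callers disagree, per sample. It is only a sketch in plain Python: the function name `compare_doublet_calls`, the barcodes, and the sample labels are all made up for illustration, and it assumes each tool's calls have already been exported as a set of cell barcodes. It does not run scrublet or any R method itself.

```python
from collections import defaultdict


def compare_doublet_calls(calls_a, calls_b, sample_of):
    """Summarise agreement between two doublet callers, per sample.

    calls_a, calls_b: sets of cell barcodes flagged as doublets by each tool.
    sample_of: dict mapping every cell barcode to its sample label.
    """
    # Group the flagged barcodes of each tool by the sample they belong to.
    per_sample = defaultdict(lambda: {"a": set(), "b": set()})
    for barcode in calls_a:
        per_sample[sample_of[barcode]]["a"].add(barcode)
    for barcode in calls_b:
        per_sample[sample_of[barcode]]["b"].add(barcode)

    summary = {}
    for sample in sorted(set(sample_of.values())):
        a = per_sample[sample]["a"]
        b = per_sample[sample]["b"]
        union = a | b
        summary[sample] = {
            "n_a": len(a),
            "n_b": len(b),
            "n_shared": len(a & b),
            # Jaccard index of the two call sets; defined as 1.0 when
            # neither tool flags any cell in this sample.
            "jaccard": len(a & b) / len(union) if union else 1.0,
        }
    return summary


# Hypothetical toy data: five cells across two samples.
sample_of = {"c1": "s1", "c2": "s1", "c3": "s1", "c4": "s2", "c5": "s2"}
python_calls = {"c1", "c2", "c4"}  # e.g. barcodes flagged on the Python side
r_calls = {"c2", "c3"}             # e.g. barcodes flagged on the R side

summary = compare_doublet_calls(python_calls, r_calls, sample_of)
for sample, stats in summary.items():
    print(sample, stats)
```

A low Jaccard index in most samples would support filtering on total_counts only, since the two pipelines would otherwise keep different cells.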

dbdimitrov commented 1 year ago

Also, doublet detection takes ages in R (even when parallelized).