Hi @YiweiNiu , this is a good observation! :)
But there is a reason for what you're seeing (I hope!).
The function load_anndata_from_input_and_output() by default loads only the droplets that were analyzed by cellbender. So the droplets past --total-droplets-included (or its default) are not included when the data is loaded.
I think that is the difference!
If you want load_anndata_from_input_and_output() to load all the droplets, you can use load_anndata_from_input_and_output(analyzed_barcodes_only=False) (https://cellbender.readthedocs.io/en/v0.3.0/reference/index.html#cellbender.remove_background.downstream.load_anndata_from_input_and_output). Then you should see that the total counts per gene agree with the raw data loaded by scanpy.
Let me know if this is not the case!
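To illustrate why the per-gene totals differ, here is a minimal sketch on a toy count matrix. It does not call cellbender: the matrix, the variable names, and the cutoff value are all made up for illustration, and taking the first rows as the "analyzed" droplets is only a stand-in for the droplets kept under --total-droplets-included.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy count matrix: 10 droplets (rows) x 3 genes (columns).
# The values are random and purely illustrative.
n_droplets, n_genes = 10, 3
X = rng.poisson(lam=5.0, size=(n_droplets, n_genes))

# Hypothetical stand-in for --total-droplets-included: here the first
# 6 rows play the role of the droplets that were analyzed.
total_droplets_included = 6

# Per-gene totals over all droplets (what you get from the full raw input).
raw_totals = np.ravel(X.sum(axis=0))

# Per-gene totals over the analyzed droplets only (analogous to the default
# analyzed_barcodes_only=True behaviour described above).
analyzed_totals = np.ravel(X[:total_droplets_included].sum(axis=0))

# Counts sitting in the droplets that were left out.
excluded_totals = np.ravel(X[total_droplets_included:].sum(axis=0))

print("raw     :", raw_totals)
print("analyzed:", analyzed_totals)
print("excluded:", excluded_totals)
```

The analyzed-only totals can never exceed the raw totals, and the gap is exactly the counts in the excluded droplets. That is the same kind of gap as the one between the two Ptma numbers in the question.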
Thank you for your quick reply. Yes, it matches when using the analyzed_barcodes_only=False parameter.
Hi,
I am testing cellbender v0.3.0 following the tutorial, but I find that the n_raw reported by cellbender is different from the raw counts of the input. Here is the test:
The n_raw of gene Ptma reported by cellbender is 162136, while the one counted from the input is 239060. The np.ravel(X.sum(axis=0)) expression is from the scanpy code.
I also checked this with R and found that the one calculated by scanpy is correct. Maybe this is a tiny thing. Or is this metric also used in other analyses besides the HTML report?
Best regards, Yiwei