Open npklein opened 1 year ago
Hi @npklein, yes I am still working on the automated warnings in the report... I think I am being a bit too aggressive in saying "WARNING: This deviates from expectations ..." Right now I am computing the Pearson R correlation coefficient for that scatter plot. But I might change it a bit, so that it's a more robust fit that weights the highly expressed genes more heavily. You can see (and I've seen the same thing) that while those scatterplots are not a perfect y = x line, they are still very correlated. And the idea here is not to get a perfect y = x line. It's just to see whether there is some kind of ballpark correlation between naive expectations of "removing what's in the empty droplets" versus what the tool actually did.
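To make the difference between the two statistics concrete, here is a minimal sketch of a plain Pearson correlation next to a weighted one. It is not CellBender's code: the helper `pearson_r`, the synthetic per-gene counts, and the choice of weighting each gene by its expected removed counts are all assumptions made for illustration.

```python
import math
import random


def pearson_r(x, y, weights=None):
    """Pearson correlation of x and y; optionally weighted per point."""
    if weights is None:
        weights = [1.0] * len(x)
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalize weights to sum to 1
    mean_x = sum(wi * xi for wi, xi in zip(w, x))
    mean_y = sum(wi * yi for wi, yi in zip(w, y))
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


# Synthetic, hypothetical data: one value per gene.
random.seed(0)
expected_removed = [random.gammavariate(0.5, 200.0) for _ in range(2000)]
# "Actual" removal = expectation times multiplicative noise (an assumption).
actual_removed = [e * random.lognormvariate(0.0, 0.5) for e in expected_removed]

r_plain = pearson_r(expected_removed, actual_removed)
# Assumed weighting scheme: highly expressed genes count more.
r_weighted = pearson_r(expected_removed, actual_removed, weights=expected_removed)

print(f"unweighted r = {r_plain:.3f}, weighted r = {r_weighted:.3f}")
```

Either number is only a ballpark check. A low value is a prompt to look at where the empty droplets were identified, not a verdict on the output.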
Basically the only corrective action for that plot would be to see if the empty droplets seem to have been identified correctly. There are times when CellBender's automated heuristics can be fooled, and maybe CellBender thinks the wrong part of the UMI curve is empties. In that case, this scatterplot might not look very correlated at all, and it might be a sign that you need to supply the --expected-cells or --total-droplets-included input arguments.
As for the learning curves, the first one looks awesome, and the second one looks not awesome. :) I am actively working on trying to come up with ways to prevent that from happening, but yes, right now the best bet is to reduce the learning rate.
You can try --learning-rate 1e-5 --epochs 300 if you want. I know that produces good results for some people, though it does take longer to train!
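If you are re-running several samples with different settings, a small wrapper that assembles the command can help keep runs consistent. This is a sketch only: `build_remove_background_cmd` is a hypothetical helper, the file names are placeholders, and `--input` / `--output` are the flags as I understand the CLI, so check `cellbender remove-background --help` for your installed version.

```python
import shlex


def build_remove_background_cmd(input_path, output_path, expected_cells=None,
                                total_droplets_included=None,
                                learning_rate=None, epochs=None):
    """Assemble a `cellbender remove-background` argument list (hypothetical helper)."""
    cmd = ["cellbender", "remove-background",
           "--input", input_path, "--output", output_path]
    optional = {
        "--expected-cells": expected_cells,
        "--total-droplets-included": total_droplets_included,
        "--learning-rate": learning_rate,
        "--epochs": epochs,
    }
    # Only pass flags that were explicitly set, so defaults stay in effect.
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


# Placeholder file names; the values are the ones suggested above.
cmd = build_remove_background_cmd(
    "raw_feature_bc_matrix.h5", "cellbender_output.h5",
    learning_rate=1e-5, epochs=300,
)
print(shlex.join(cmd))  # pass `cmd` to subprocess.run(cmd, check=True) to execute
```

Leaving `expected_cells` and `total_droplets_included` unset keeps the automated heuristics. Set them only if the empty droplets look misidentified.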
Several tweaks have been made very recently that should hopefully clear this up for you in v0.3.0.
Potentially closed by #238
Thanks for developing this tool!
I am looking through the reports of a couple of my samples, and all of them seem to have the following warning
However, I'm not sure what this might indicate (e.g., should I change parameters?), and couldn't find it in your troubleshooting section.
For some samples the other QC plots seem to look good, for example:
For others, the learning curve is also not looking good (but here the report gives an indication of what to try, i.e. lowering the learning rate).
Do you have some additional info on this warning?
Thanks!