heathergeiger closed this issue 4 years ago.
Hello @heathergeiger
There is a filter in CITE-seq-Count that skips UMI correction for cells that have at least one TAG with more than 20,000 UMIs.
There are two reasons for that.
1. Performance issues. Having too many UMIs to correct for one TAG takes a really long time (hours/days).
2. Cell aggregation. A really high number of UMIs for one TAG is usually a sign of multiple cells aggregating, which might skew your analysis.
For those two reasons, those cells are currently just not corrected. You can still look at the uncorrected reads and check whether those cells look fine, but they are probably not usable. If your observations on those specific cells differ, please let me know so that I can look for a long-term solution.
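The filter described above can be sketched roughly like this. The 20,000 cutoff and the TAG-by-cell layout follow the thread and dense_umis.tsv, but the barcodes and counts below are made-up illustration, not CITE-seq-Count's actual implementation:

```python
import pandas as pd

UMI_THRESHOLD = 20_000  # the cutoff mentioned in this thread

# rows = TAGs, columns = cell barcodes (same shape as dense_umis.tsv);
# these values are invented for illustration
umis = pd.DataFrame(
    {
        "AAACCTG": [150, 80, 30],
        "TTTGGTC": [25_000, 40, 10],  # one TAG over the threshold
    },
    index=["HTO_1", "HTO_2", "HTO_3"],
)

# a cell is skipped by UMI correction if any single TAG exceeds the cutoff
uncorrected = umis.columns[(umis > UMI_THRESHOLD).any(axis=0)]
print(list(uncorrected))  # ['TTTGGTC']
```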
I should add that you have a high number of "bad cells". Is this a hashing design?
@Hoohm We are facing the same problem. Actually, we found that the cells in the uncorrected cell file are important to our research, and we want to keep them in our final result. Can you please tell us if there is anything we can do now to add those cells back to the final result? Thank you!
@YunZheHuang There are three options here. One of them is the `-n` option, to get maybe 3/4 of your data.

Hello @YunZheHuang, any news regarding your issue?
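If you do decide to add the uncorrected cells back, one hedged way to sketch it is to append them as extra columns of the final dense matrix. The tiny frames below stand in for the real files (in practice you would load them with `pd.read_csv(..., sep="\t", index_col=0)` from the corrected output and `uncorrected_cells/dense_umis.tsv`); keep in mind the thread's caveat that these counts were never UMI-corrected, so they are likely inflated:

```python
import pandas as pd

# stand-in for the corrected TAG x cell matrix (invented values)
corrected = pd.DataFrame(
    {"CELL_A": [10, 5]}, index=["HTO_1", "HTO_2"]
)
# stand-in for uncorrected_cells/dense_umis.tsv (invented values)
uncorrected = pd.DataFrame(
    {"CELL_B": [30_000, 12]}, index=["HTO_1", "HTO_2"]
)

# append the uncorrected cells as extra columns; their UMIs were never
# collapsed, so treat their counts with caution downstream
combined = pd.concat([corrected, uncorrected], axis=1)
print(combined.columns.tolist())  # ['CELL_A', 'CELL_B']
```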
Closing it. Feel free to reopen.
Hi,
I have a question regarding uncorrected cells. In our experiment (approx 6,000 cells) we have one cell that is "uncorrected" (so not a big issue, I just want to understand it a little better) - from this thread, I understand that correction won't happen if UMIs > 20,000. I understand that performance is an issue, but to clarify on the potential of aggregation - are you saying that this could potentially be a sign of multiple cells with the same tag being captured?
This is the output of dense_umis.tsv:
[image: image] https://user-images.githubusercontent.com/39644664/149515056-ecbbe6c3-bb79-482b-9965-335d5946a33a.png
Thanks! Colin
Hey Colin.
Aggregation is a phenomenon where antibodies stick to each other, leading to more tags being captured.
So it's not multiple cells, it's just antibodies sticking to each other.
Hope this helps
Do you know what it means when some of the barcodes from the whitelist are not in the "barcodes.tsv" output file, but are instead the column names under uncorrected_cells/dense_umis.tsv?
My whitelists were around 350-420 cells per pool. Of these, ~20-50 cells were not in the barcodes.tsv output file.
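One way to check where each whitelist barcode ended up is a simple set comparison between the whitelist, barcodes.tsv, and the columns of uncorrected_cells/dense_umis.tsv. This is only a sketch; the file names follow the thread, and the barcode sets below are made-up stand-ins for the real files:

```python
# stand-ins for the real barcode lists (invented values)
whitelist = {"AAAC", "TTTG", "GGGA"}      # the provided whitelist
corrected_barcodes = {"AAAC"}             # barcodes.tsv contents
uncorrected_barcodes = {"TTTG"}           # dense_umis.tsv column names

missing = whitelist - corrected_barcodes          # not in barcodes.tsv
only_uncorrected = missing & uncorrected_barcodes # landed in uncorrected output
lost = missing - uncorrected_barcodes             # in neither output
print(sorted(only_uncorrected), sorted(lost))  # ['TTTG'] ['GGGA']
```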