Hoohm / CITE-seq-Count

A tool that allows to get UMI counts from a single cell protein assay
https://hoohm.github.io/CITE-seq-Count/
MIT License
79 stars 44 forks source link

Barcodes from whitelist put into "uncorrected_cells" matrix not barcodes.tsv #61

Closed heathergeiger closed 4 years ago

heathergeiger commented 5 years ago

Do you know what it means when some of the barcodes from the whitelist are not in the "barcodes.tsv" output file, but are instead the column names under uncorrected_cells/dense_umis.tsv?

My whitelists were around 350-420 cells per pool. Of these, ~20-50 of these cells were not in the barcodes.tsv output file.
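To see where each whitelisted barcode ended up, a short comparison can help. This is a minimal sketch, not part of CITE-seq-Count: it assumes barcodes.tsv holds one barcode per line and that the first row of uncorrected_cells/dense_umis.tsv lists the cell barcodes after a leading tag-name column. The function names are hypothetical.

```python
import csv
import io


def read_barcodes(handle):
    """Read one barcode per line (layout assumed for barcodes.tsv)."""
    return {line.strip() for line in handle if line.strip()}


def read_dense_columns(handle):
    """Return the cell barcodes from the header row of a dense TSV.

    Assumes the first column holds tag names, so it is skipped.
    """
    header = next(csv.reader(handle, delimiter="\t"))
    return set(header[1:])


def locate_whitelist(whitelist, corrected, uncorrected):
    """Sort whitelisted barcodes by the output they appear in."""
    return {
        "corrected": whitelist & corrected,
        "uncorrected": whitelist & uncorrected,
        "missing": whitelist - corrected - uncorrected,
    }


# Tiny in-memory example standing in for the real files.
whitelist = {"AAAC", "CCCG", "TTTG"}
corrected = read_barcodes(io.StringIO("AAAC\nGGGA\n"))
uncorrected = read_dense_columns(io.StringIO("\tCCCG\nCD3\t25000\n"))
report = locate_whitelist(whitelist, corrected, uncorrected)
```

With real data, the file handles would come from `open(...)` on the two output files instead of `io.StringIO`.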

Hoohm commented 5 years ago

Hello @heathergeiger

CITE-seq-Count has a filter that skips UMI correction for any cell that has at least one TAG with more than 20,000 UMIs.

There are two reasons for that.

  1. Performance issues. Correcting too many UMIs for one TAG takes a really long time (hours/days).
  2. Cell aggregation. A really high number of UMIs for one TAG seems to indicate multiple cells aggregating, which might skew your analysis.

For those two reasons, today, those cells are just not corrected. You can still look at the uncorrected reads and check if those cells look fine, but they are probably not useable. If your observation differs on those specific cells, please let me know so that I can check for a long term solution.
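The filter described above can be pictured roughly as follows. This is only an illustrative sketch of the rule as stated in this thread, not the actual CITE-seq-Count code; the function name, the constant name and the nested-dict layout are assumptions.

```python
MAX_UMIS_PER_TAG = 20_000  # threshold quoted in this thread


def split_cells(umis_per_cell, limit=MAX_UMIS_PER_TAG):
    """Split cells into those to correct and those left uncorrected.

    umis_per_cell maps barcode -> {tag: number of UMIs} (assumed layout).
    A cell is left uncorrected as soon as one tag exceeds the limit.
    """
    to_correct, uncorrected = {}, {}
    for barcode, tag_counts in umis_per_cell.items():
        if any(n > limit for n in tag_counts.values()):
            uncorrected[barcode] = tag_counts
        else:
            to_correct[barcode] = tag_counts
    return to_correct, uncorrected


cells = {
    "AAAC": {"CD3": 150, "CD4": 900},
    "CCCG": {"CD3": 25_000, "CD4": 40},  # one tag above the limit
}
to_correct, uncorrected = split_cells(cells)
```

Note that a single tag above the limit is enough to move the whole cell to the uncorrected set, even if its other tags have low counts.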

Hoohm commented 5 years ago

I'd add that you have a high number of "bad cells". Is this a hashing design?

YunZheHuang commented 5 years ago

@Hoohm We are facing the same problem. We found that the cells in the uncorrected cells file are important to our research, and we want to keep them in our final result. Can you please tell us if there is anything we can do now to add those cells back to the final result? Thank you!

Hoohm commented 5 years ago

@YunZheHuang There are three options here.

  1. Test out the feature/index_whitelist branch. It has a change that skips correction of unmapped UMIs. You may just have a lot of unmapped reads on those cells, in which case this should fix the issue.
  2. Downsample your run. Use the -n option to get maybe 3/4 of your data.
  3. I could add an option exposing the limit for the max number of UMIs to correct and you can run it again. This would allow running the whole thing but it might take days to finish.
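For option 2, the number to pass to `-n` can be worked out from the size of the FASTQ file. The helper below is a rough sketch under two assumptions: that `-n` takes a read count, and that the FASTQ uses the standard four lines per record. Both function names are hypothetical.

```python
def count_fastq_records(lines):
    """Count reads in a FASTQ given as an iterable of lines (4 lines per read)."""
    n_lines = sum(1 for _ in lines)
    if n_lines % 4:
        raise ValueError("line count is not a multiple of 4; truncated FASTQ?")
    return n_lines // 4


def reads_for_fraction(total_reads, fraction=0.75):
    """Return the read count corresponding to a fraction of the run."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(total_reads * fraction)


# Eight fake reads standing in for a real (possibly gzipped) FASTQ file.
fake_fastq = ["@read", "ACGT", "+", "IIII"] * 8
total = count_fastq_records(fake_fastq)
n_reads = reads_for_fraction(total)  # value to try with -n
```

For a gzipped file, the lines could be read with `gzip.open(path, "rt")` from the standard library.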

Hoohm commented 5 years ago

Hello @YunZheHuang, any news regarding your issue?

Hoohm commented 4 years ago

Closing it. Feel free to reopen.

colin986 commented 2 years ago

Hi,

I have a question regarding uncorrected cells. In our experiment (approx. 6,000 cells) we have one cell that is "uncorrected" (so not a big issue, I just want to understand it a little better). From this thread, I understand that correction won't happen if UMIs > 20,000. I understand that performance is an issue, but to clarify on the potential for aggregation: are you saying that this could be a sign of multiple cells with the same tag being captured?

This is the output of dense_umis.tsv

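[image: image] https://user-images.githubusercontent.com/39644664/149515056-ecbbe6c3-bb79-482b-9965-335d5946a33a.png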

Thanks! Colin

Hoohm commented 2 years ago

Hey Colin.

Aggregation is a phenomenon where antibodies stick to each other, leading to more tags being captured.

So it's not multiple cells, it's just antibodies sticking to each other.
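To check which tag pushes a cell over the limit, the dense matrix can be scanned per cell. This is a small sketch, assuming dense_umis.tsv has tags as rows and cell barcodes as columns with a leading tag-name column; the function name is hypothetical and the 20,000 default is the limit quoted earlier in this thread.

```python
import csv
import io


def dominant_tags(handle, limit=20_000):
    """Return {barcode: (tag, n_umis)} for cells whose top tag exceeds limit."""
    reader = csv.reader(handle, delimiter="\t")
    barcodes = next(reader)[1:]  # skip the tag-name column
    best = {bc: ("", 0) for bc in barcodes}
    for row in reader:
        tag, counts = row[0], row[1:]
        for bc, value in zip(barcodes, counts):
            n = int(float(value))  # tolerate counts written as "25000.0"
            if n > best[bc][1]:
                best[bc] = (tag, n)
    return {bc: hit for bc, hit in best.items() if hit[1] > limit}


# In-memory stand-in for uncorrected_cells/dense_umis.tsv.
dense = io.StringIO("\tAAAC\tTTTG\nCD3\t25000\t10\nCD4\t5\t300\n")
flagged = dominant_tags(dense)
```

A cell flagged with one very dominant tag would be consistent with the antibody aggregation described above, though the count alone does not prove it.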

Hope this helps
