NKI-GCF / featureseek

QC feature barcodes in 10X Feature Barcode FastQ files
MIT License

CCTAATGGTCCAGAC ignored along with GGGGGGGGGGGGGG? #1

Closed: xmao closed this issue 5 months ago

xmao commented 8 months ago

Wondering what CCTAATGGTCCAGAC is?

veldsla commented 8 months ago

We sometimes see these sequences in relatively high numbers. The GGGG sequence could be sequencing related. If you google the other sequence, you find a hit in a document about a feature barcode antibody conjugation kit. I think it is some kind of by-product or contamination.

You can check the number of ignored reads in the report. If the number is high, you can count them by running featureseek with -x "" -u
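As an independent cross-check outside featureseek, the reads carrying these two sequences can also be counted directly from the FastQ file. The following is only a minimal sketch, not how featureseek itself works: the function names `count_ignored` and `count_ignored_in_file` are hypothetical, the two patterns are copied from the issue title, and it assumes a plain four-line-per-record FastQ where a simple substring match anywhere in the read is good enough.

```python
import gzip

# Sequences copied from the issue title; everything else is illustrative.
IGNORED = ("CCTAATGGTCCAGAC", "GGGGGGGGGGGGGG")


def count_ignored(handle, patterns=IGNORED):
    """Return (total_reads, {pattern: reads containing it}) for a FastQ stream.

    Assumes four lines per record, with the sequence on the second line.
    """
    counts = {p: 0 for p in patterns}
    total = 0
    for i, line in enumerate(handle):
        if i % 4 != 1:  # skip header, '+' separator and quality lines
            continue
        total += 1
        seq = line.strip()
        for p in patterns:
            if p in seq:
                counts[p] += 1
    return total, counts


def count_ignored_in_file(path, patterns=IGNORED):
    """Open a plain or gzip-compressed FastQ file and count the patterns."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        return count_ignored(fh, patterns)
```

Calling `count_ignored_in_file` on a feature barcode FastQ file returns the total number of reads and a per-sequence count, which gives a rough idea of how much of the run is taken up by these sequences.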

This program needs some more work. Better filtering options and some sort of valid barcode detection would be nice. Let me know if you think it is useful.

veldsla commented 5 months ago

The issue has been picked up by 10x Genomics support. They asked me to post the following message:

"10x Genomics has requested that anyone who has seen high levels of this sequence in their data should contact technical support at: support@10xgenomics.com so they can review the data with you."