Closed prmunn closed 3 years ago
Hi @prmunn, to extract cell barcodes from the read name you can provide a regular expression; in your case something like --barcode_regex "(?<=_)(.*)(?=_)" should work. When looking into this, I realized that it's currently difficult to match strings that don't start at the beginning of the read name, so I have made a change that should make that easier (you'll need to install from github).
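For anyone who wants to check a pattern before running sinto, here is a minimal Python sketch. It assumes the expression is applied to the read name with a search-style match (not confirmed here against the sinto source); the read names and the `extract_barcode` helper are made up for illustration and are not from the original BAM file. It also tries a hypothetical stricter variant, `(?<=_)[^_]+(?=_)`, because the greedy `.*` runs up to the last underscore when a read name contains more than two.

```python
import re

# Pattern suggested in the reply: text preceded by "_" and followed by "_".
BARCODE_REGEX = r"(?<=_)(.*)(?=_)"
# Hypothetical stricter variant (not from the thread): stop at the next "_".
STRICT_REGEX = r"(?<=_)[^_]+(?=_)"


def extract_barcode(read_name, pattern=BARCODE_REGEX):
    """Return the first match of pattern in read_name, or None if no match."""
    match = re.search(pattern, read_name)
    return match.group(0) if match else None


# Made-up read names for illustration only.
two_underscores = "READ1_ACGTACGT_extra"
three_underscores = "READ1_ACGTACGT_extra_more"

print(extract_barcode(two_underscores))                  # ACGTACGT
print(extract_barcode(three_underscores))                # ACGTACGT_extra
print(extract_barcode(three_underscores, STRICT_REGEX))  # ACGTACGT
```

If the read names only ever contain two underscores, both patterns give the same result.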
If you set the --barcode_regex parameter, the --barcodetag parameter is not used (see https://timoast.github.io/sinto/basic_usage.html#create-scatac-seq-fragments-file). You can set --use_chrom "" to match all chromosomes.
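To see why an empty pattern keeps numeric chromosome names, here is a rough Python sketch. It assumes the default --use_chrom pattern is "(?i)^chr" and that the pattern is tested against each chromosome name with a regular-expression search; both are assumptions to check against the sinto documentation. `keep_chromosome` is a hypothetical helper, not a sinto function.

```python
import re

# Assumed default for --use_chrom: names starting with "chr", case-insensitive.
DEFAULT_PATTERN = r"(?i)^chr"
# The empty pattern matches at the start of any string.
MATCH_ALL = ""


def keep_chromosome(name, pattern):
    """Return True if pattern is found anywhere in the chromosome name."""
    return re.search(pattern, name) is not None


# Made-up chromosome names: numeric style plus one "chr"-prefixed name.
names = ["1", "2", "X", "chr1"]

kept_default = [n for n in names if keep_chromosome(n, DEFAULT_PATTERN)]
kept_all = [n for n in names if keep_chromosome(n, MATCH_ALL)]

print(kept_default)  # ['chr1']
print(kept_all)      # ['1', '2', 'X', 'chr1']
```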
Yep - that worked. Many thanks!
Hi - I'm new to sinto and I'm unable to produce a fragments file. My cell barcodes are in the header rows of my bam file, between the first and second underscore, and I have numeric values for my chromosomes, with no "chr" at the beginning. So, I need to know what the regex pattern would look like for both the --barcode_regex and --use_chrom options, and I need to know how to stop --barcodetag from using the default of "BC". Here are the first 10 rows of my bam file: