sklages closed this issue 6 years ago
Hi Sven,
There could be a few things going on. By default, STAR treats reads with more than 10 genomic alignments as unmapped, so if you expect >10 alignments to be typical, you'll need to raise outFilterMultimapNmax by adding the argument to the STAR call here:
https://github.com/10XGenomics/cellranger/blob/master/lib/python/cellranger/reference.py#L527
Once you've done that, or if the number of alignments per read is typically at most 10, Cell Ranger (CR) will still not consider these reads for UMI counting, but it will report them in its final BAM file.
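To illustrate what "adding the argument to the STAR call" amounts to, here is a minimal sketch that builds a STAR command line with a raised multimapper limit. It is not the code from reference.py: the function name star_args, the paths, and the value 50 are assumptions made up for illustration. Only the STAR flags themselves (--genomeDir, --readFilesIn, --runThreadN, --outFilterMultimapNmax) are real.

```python
def star_args(genome_dir, fastq_path, multimap_nmax=10, threads=4):
    """Build a STAR command line as a list of arguments.

    Hypothetical helper for illustration only; Cell Ranger assembles its
    STAR call differently. multimap_nmax maps to --outFilterMultimapNmax,
    whose STAR default is 10.
    """
    if multimap_nmax < 1:
        raise ValueError("multimap_nmax must be at least 1")
    return [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq_path,
        "--runThreadN", str(threads),
        # Reads with more alignments than this are reported as unmapped.
        "--outFilterMultimapNmax", str(multimap_nmax),
    ]


if __name__ == "__main__":
    # Placeholder paths; print the command instead of running it.
    print(" ".join(star_args("/path/to/star_index", "reads.fastq", multimap_nmax=50)))
```

The resulting list could be passed to subprocess.run if you run STAR yourself; inside Cell Ranger the equivalent change is appending the flag and its value to the existing argument list.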
If you want to realign the data yourself (while preserving cell barcodes and UMIs) you have a few options, but unfortunately none of them are easy.
1. Convert the final BAM file back to FASTQ. Most tools don't preserve the UMI/barcode tags in this conversion, but you can do it by writing a Python script that uses pysam.
2. Kill the pipeline after EXTRACT_READS finishes but before its children finish. Nested deep under the EXTRACT_READS directory you'll find a set of FASTQ files that contain the barcode and UMI info.
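For the BAM-to-FASTQ route, a rough sketch of such a pysam script might look like the following. It assumes the Cell Ranger BAM carries the corrected cell barcode in the CB tag and the corrected UMI in the UB tag, and it writes them into the FASTQ header comment so a downstream aligner can carry them along. The function names and the header layout are my own choices, not something Cell Ranger prescribes. pysam is a third-party package and is only imported when a BAM file is actually read.

```python
import sys

# Tags to carry over; CB/UB are assumed to hold corrected barcode and UMI.
TAGS_TO_KEEP = ("CB", "UB")


def fastq_record(name, seq, quals, tags):
    """Format one FASTQ record, putting kept tags in the header comment."""
    comment = " ".join(f"{k}:Z:{tags[k]}" for k in TAGS_TO_KEEP if k in tags)
    header = f"@{name} {comment}" if comment else f"@{name}"
    # Phred+33 encoding of the integer base qualities.
    qual_str = "".join(chr(q + 33) for q in quals)
    return f"{header}\n{seq}\n+\n{qual_str}\n"


def bam_to_fastq(bam_path, out=sys.stdout):
    """Write every primary read in bam_path to out as FASTQ."""
    import pysam  # third-party dependency, imported lazily

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        # until_eof=True also yields unmapped reads.
        for read in bam.fetch(until_eof=True):
            if read.is_secondary or read.is_supplementary:
                continue  # avoid emitting the same read more than once
            if read.query_sequence is None:
                continue
            tags = {k: read.get_tag(k) for k in TAGS_TO_KEEP if read.has_tag(k)}
            out.write(fastq_record(
                read.query_name,
                read.get_forward_sequence(),  # original read orientation
                read.get_forward_qualities() or [],
                tags,
            ))


if __name__ == "__main__" and len(sys.argv) == 2:
    bam_to_fastq(sys.argv[1])
```

Reads without a valid barcode have no CB tag and come out with a bare header, so you can filter them later. Whether your aligner copies the header comment into its output depends on the aligner and its options, so check that before relying on it.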
One naive question: how can I kill the pipeline automatically, as you mentioned above? I cannot find any FASTQ files under the EXTRACT_READS directory after a complete cellranger run.
Hi, this is more an information request than a software issue.
I'd like to use cellranger for a custom genome reference. I know that my data will not map uniquely to that genome; I expect multiple hits for many reads. The standard cellranger count workflow will discard such reads. That would leave me without data at the end ;-)
I could not find a way to alter the STAR alignment parameters, e.g. outFilterMultimapNmax. Can you provide some info on where I can find the parameter settings for STAR in the package?
Is there a way to get "intermediate" data? An alternative would be to use the UMI-collapsed reads from cellranger and feed these into some standard aligner like bwa with subsequent "manual" analysis (I don't need spliced alignment).
I'd appreciate any hints.
best, Sven