Closed FrickTobias closed 5 years ago
python linkedsv.py -r genome.fa -d LinkedSV/ -i 19x-BLR-bulk-O1_2-filt.bam -t 20
Hi,
LinkedSV uses the HP tag to read haplotype information and the BX tag to read the barcode for a read/alignment. What kind of data do you have, and is haplotype information included in the BAM file? If so, I can change the code so that users can specify the tag names for barcode and haplotype information.
Best, Li
I can quite easily change a tag, so just having them clearly listed is enough for me; thanks for the offer, though. Maybe the required input format could be written somewhere, like "sorted BAM file with BX and HP tags, as output by Longranger", in case someone else also wants to expand on the intended use.
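For anyone else coming from a custom pipeline, here is a minimal sketch of how one might check for the two tags mentioned above before running LinkedSV. It is not part of LinkedSV: the function name `count_missing_tags` and the demo records are made up for illustration, and it works on SAM-formatted text lines (for example, as printed by `samtools view -h`) rather than reading the BAM directly. Whether LinkedSV tolerates reads without an HP tag is not stated in this thread, so treat the counts as a diagnostic only.

```python
REQUIRED_TAGS = ("BX", "HP")  # the two tags named in this thread


def count_missing_tags(sam_lines, required=REQUIRED_TAGS):
    """Return (n_alignments, {tag: n_alignments_without_tag}).

    `sam_lines` is any iterable of SAM-formatted text lines.
    Hypothetical helper for illustration, not a LinkedSV function.
    """
    missing = {tag: 0 for tag in required}
    total = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Columns 1-11 are mandatory; optional fields follow as TAG:TYPE:VALUE.
        present = {field[:2] for field in fields[11:]}
        total += 1
        for tag in required:
            if tag not in present:
                missing[tag] += 1
    return total, missing


# Made-up records: read1 carries both tags, read2 lacks HP.
demo_lines = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\tBX:Z:AAAC-1\tHP:i:1",
    "read2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tFFFF\tBX:Z:AAAG-1",
]
total, missing = count_missing_tags(demo_lines)
print(total, missing)
```

To run it on a real file, the lines could be read from the standard output of `samtools view`, for example by passing `sys.stdin` to the function.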
Ok. By the way, please make sure you use the lariat aligner (https://github.com/10XGenomics/lariat) to generate the BAM file. The lariat aligner takes barcodes into account and maps reads better in regions where traditional short-read aligners perform badly. If you use bwa-mem or other aligners, reads may be mismapped, which causes false-positive SV calls.
Thank you for the advice.
Problem
I am getting a ValueError: data must be 2 dimensions when running LinkedSV (see below for the full error message and context).
Possible solution
I am running this on data that comes from a custom pipeline rather than Longranger output, and as such it does not have all the SAM tags one might expect in Longranger output. Searching the GitHub repository, I found multiple mentions of what seems to be an HP tag. Is this something LinkedSV requires?
If this is the solution, are there any other tag requirements I should be aware of?
10x haplotyping tags
Taken from the 10x Genomics homepage.
Explicit error
From row 343283 in the stderr output: