cancerit / alleleCount

Support code for NGS copy number algorithms. Takes a file of locations and a [cr|b]am file and generates a count of coverage of each allele [ACGT] at that location (given any filter settings)
http://cancerit.github.io/alleleCount/
GNU Affero General Public License v3.0

10x data #47

Closed by thelingxichen 4 years ago

thelingxichen commented 6 years ago

Hi,

I am using allelecount-4.0.0 and tested it with 10X data:

alleleCounter \
    -l alleleCount-4.0.0/testData/test10X.loci \
    -b alleleCount-4.0.0/testData/test10X.bam \
    -o test \
    -m 20 \
    -r /home/BIOINFO_DATABASE/database_for_LongRanger/refdata-hg19-2.1.0/fasta/genome.fa.fai \
    -d 

And the output file (test) contains:

#CHR    POS     Count_A Count_C Count_G Count_T Good_depth
1       198661939       0       0       0       0       0 

Yes, I am using the longranger hg19 reference, whose contig names have a 'chr' prefix, but it works fine with the test.bam data.

So, is there something wrong with 10X mode?

Also, my own data is WGS 10X data, whose barcode tag information differs from that in the 10X test data you provide. Is this okay?

Thanks

keiranmraine commented 6 years ago

You will need to pre-process the loci file to include the chr prefix where appropriate.
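One way to do that pre-processing is sketched below in Python. It is only an illustration, not part of alleleCount: the function names add_chr_prefix and prefix_loci_file are hypothetical, and it assumes the loci file is tab-separated with the contig name in the first column. It also only prepends the prefix, so a contig such as MT would become chrMT rather than chrM and may need separate handling depending on the reference.

```python
def add_chr_prefix(lines, prefix="chr"):
    """Yield loci records with `prefix` added to the contig column when missing.

    Assumption: records are tab-separated and the first column is the contig.
    """
    for line in lines:
        record = line.rstrip("\r\n")
        # Pass blank lines and '#' header/comment lines through unchanged.
        if not record or record.startswith("#"):
            yield record
            continue
        contig, sep, rest = record.partition("\t")
        # Leave contigs that already carry the prefix alone.
        if not contig.startswith(prefix):
            contig = prefix + contig
        yield contig + sep + rest


def prefix_loci_file(src_path, dst_path, prefix="chr"):
    """Write a copy of the loci file at src_path with prefixed contig names."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for record in add_chr_prefix(src, prefix):
            dst.write(record + "\n")
```

For example, prefix_loci_file("test10X.loci", "test10X.chr.loci") would write a prefixed copy (the output file name is made up here), which can then be passed to -l.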

As per the help:

 -x  --is-10x                    Enables 10X processing mode.
                                   In this mode the HTS input file must be a cellranger produced BAM file.  Allele
                                   counts are then given on a per-cellular barcode basis, with each count representing
                                   the consensus base for that UMI. 

I don't personally know if the barcode information is stored in the same tags for longranger. If this doesn't work it may be incompatible.
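A quick way to see which barcode tags a BAM actually carries is to scan some of its reads. The Python sketch below is a rough check, not something alleleCount provides: count_tags is a hypothetical helper that reads SAM-format text (for example the output of samtools view). The default tag names are assumptions taken from the 10x documentation as I understand it (CB and UB for cellranger cell barcode and UMI, BX for the longranger barcode); which tags alleleCounter reads in 10X mode is not stated in this thread.

```python
from collections import Counter


def count_tags(sam_lines, tags=("CB", "UB", "BX")):
    """Return (number of reads seen, Counter of reads carrying each tag).

    sam_lines: iterable of SAM text lines, e.g. from `samtools view`.
    tags: two-letter optional-field tags to look for (assumed names).
    """
    total = 0
    counts = Counter()
    for line in sam_lines:
        # Skip SAM header lines and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        total += 1
        fields = line.rstrip("\r\n").split("\t")
        # Optional fields follow the 11 mandatory columns as TAG:TYPE:VALUE.
        present = {f[:2] for f in fields[11:] if len(f) >= 5 and f[2] == ":"}
        for tag in tags:
            if tag in present:
                counts[tag] += 1
    return total, counts
```

Piping the first few thousand reads of each BAM into a small script that calls count_tags(sys.stdin) would show whether the longranger file has any CB or UB tags at all, or only BX.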