vibansal / ancestry

program to estimate admixture coefficients from individual genotype or sequence data

did not find match for chrom: chr1 in hashtable error #8

Closed avilella closed 5 years ago

avilella commented 5 years ago

Hi again,

I am running ancestry with an hg38 bam file that looks like this:

samtools view -H $bsn | head
@HD     VN:1.5  SO:coordinate
@SQ     SN:chr1 LN:248956422
@SQ     SN:chr2 LN:242193529
@SQ     SN:chr3 LN:198295559
@SQ     SN:chr4 LN:190214555
@SQ     SN:chr5 LN:181538259
@SQ     SN:chr6 LN:170805979
@SQ     SN:chr7 LN:159345973
@SQ     SN:chr8 LN:145138636
@SQ     SN:chr9 LN:138394717

But when I run it with the hg38 .freqs file in DATA/, I get these messages:

[...]
reading sorted bamfile /home/ec2-user/data/epi/56001710270833/sample.hg38_merged.deduplicated.bsc.sorted.bam 
did not find match for chrom: chr1 in hashtable 
did not find match for chrom: chr2 in hashtable 
did not find match for chrom: chr3 in hashtable 
processed 2000000 reads, useful fragments 0
did not find match for chrom: chr4 in hashtable 
did not find match for chrom: chr5 in hashtable 
did not find match for chrom: chr6 in hashtable 
did not find match for chrom: chr7 in hashtable 
did not find match for chrom: chr8 in hashtable 
processed 4000000 reads, useful fragments 0

[...]
making input file for ancestry calculations
ancestry admixture calculations for /home/ec2-user/data/epi/56001710270833/sample.hg38_merged.deduplicated.bsc.sorted.bam poolsize is  2
difficult to estimate admixture coefficients with high confidence

final maxval 0.000000 ADMIX_PROP YRI:0.167565 CHB:0.153781 CHD:0.016052 TSI:0.197848 MKK:0.013955 LWK:0.053050 CEU:0.203604 JPT:0.194147 
YRI:0.1676:0.00 CHB:0.1538:0.00 CHD:0.0161:0.00 TSI:0.1978:0.00 MKK:0.0140:0.00 LWK:0.0530:0.00 CEU:0.2036:0.00 JPT:0.1941:0.00 FINAL_ALL_PROPS
YRI:0.1676:0.00 CHB:0.1538:0.00 CHD:0.0161:0.00 TSI:0.1978:0.00 MKK:0.0140:0.00 LWK:0.0530:0.00 CEU:0.2036:0.00 JPT:0.1941:0.00 FINAL_NZ_PROPS bins 200 ssq 0.000000
LL 0.000000 LL_exp 0.000000 0.000000 -nan

Any ideas what the issue might be? Thanks in advance.

avilella commented 5 years ago

I found the solution: it needed --addchr when calling runancestry.py. Thanks!
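
For anyone hitting the same message, here is a minimal sketch of how one might check for a "chr" prefix mismatch before running the tool. It is not part of ancestry: the function names `sq_names` and `diagnose_prefix` are hypothetical, and the site names `["1", "2", "3"]` are an assumption about how the allele frequency file might name chromosomes, not something read from the actual .freqs file.

```python
def sq_names(header_text):
    """Return contig names from the @SQ lines of SAM header text."""
    names = []
    for line in header_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "@SQ":
            for field in fields[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def diagnose_prefix(bam_names, site_names):
    """Classify how two sets of contig names relate (hypothetical helper)."""
    bam, sites = set(bam_names), set(site_names)
    if bam & sites:
        return "match"
    # BAM says chr1, sites say 1
    if {"chr" + n for n in sites} & bam:
        return "bam_has_chr_prefix"
    # BAM says 1, sites say chr1
    if {"chr" + n for n in bam} & sites:
        return "sites_have_chr_prefix"
    return "no_overlap"


# Header text as printed by `samtools view -H`, shortened to two contigs.
header = (
    "@HD\tVN:1.5\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:chr2\tLN:242193529\n"
)

# Assumed (not verified) site names without the prefix.
print(diagnose_prefix(sq_names(header), ["1", "2", "3"]))
```

A result of `bam_has_chr_prefix` would be consistent with the situation above, where the BAM uses chr1 and the run only worked once --addchr was passed.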

bapoorva commented 1 month ago

Hi,

I'm having the same issue. -addchr didn't change anything. The notes say that if we use BAM files as input, we have to run CalculateGLL first. How do we do that?

Thank you