I'm using your tool in the course of haplotype phasing moss contigs.
As a first proof of concept, I called 212 variants (mostly SNPs) with freebayes on a specific contig of ~6,5kb and tried to phase them with hapcut and readphaser. I have a bam file of reads mapped against the contig containing 825 alignments and the necessary output of hapcut containing 142 variants in 3 blocks.
Unfortunately, when I run rp.py like this:
[phase_reads] opening alignment BAM file... done.
[phase_reads] phasing 'NODE_7344_length_6467_cov_3.70633_ID_13259', block_id 0
[error] inconsistent totals for contig NODE_7344_length_6467_cov_3.70633_ID_13259# contig='NODE_7344_length_6467_cov_3.70633_ID_13259' block_id=0 indels=36 no_overlap=460 phased=325 unmapped=2 filtered=36 total=825
NODE_7344_length_6467_cov_3.70633_ID_13259 0 539 GTGATGGG:6;AAGATTGT:2
...
NODE_7344_length_6467_cov_3.70633_ID_13259 0 2863 C:6;T:12
[phase_reads] phasing 'NODE_7344_length_6467_cov_3.70633_ID_13259', block_id 1
Traceback (most recent call last):
File "/home/vangessel/Software/readphaser/pr.py", line 364, in <module>
output_main(args)
File "/home/vangessel/Software/readphaser/pr.py", line 334, in output_main
region=args.region)
File "/home/vangessel/Software/readphaser/pr.py", line 297, in phase_reads
group_reads_by_block(reads, block, block_id, callback, stats, inconsistent_counts_file)
File "/home/vangessel/Software/readphaser/pr.py", line 211, in group_reads_by_block
htype_pos = group_varpos_by_haplotype(readpair, haplotypes, allele_counts)
File "/home/vangessel/Software/readphaser/pr.py", line 147, in group_varpos_by_haplotype
read_var = read.query[interval[0]-read.pos:interval[1]-read.pos]
TypeError: 'NoneType' object has no attribute '__getitem__'
It stops right where the second of the three blocks starts and thus fills the phased.fq with reads for all variants included in the first block. As I am no python expert, I'm not sure how to track down this error. The reads in the given region and their alignments don't look suspicious.
When I split up the blocks into individual input files, the first block gets processed successfully. However, the second and third block do not.
Do you have any idea?
If it's necessary, I can provide you with the input files. As they are only for one contig, they are rather small in size...
I would very much appreciate your help.
Regards,
Nico
Hey Vince,
I'm using your tool in the course of haplotype phasing moss contigs. As a first proof of concept, I called 212 variants (mostly SNPs) with freebayes on a specific contig of ~6,5kb and tried to phase them with hapcut and readphaser. I have a bam file of reads mapped against the contig containing 825 alignments and the necessary output of hapcut containing 142 variants in 3 blocks. Unfortunately, when I run rp.py like this:
I get the following error message:
It stops right where the second of the three blocks starts and thus fills the phased.fq with reads for all variants included in the first block. As I am no python expert, I'm not sure how to track down this error. The reads in the given region and their alignments don't look suspicious.
When I split up the blocks into individual input files, the first block gets processed successfully. However, the second and third block do not.
Do you have any idea? If it's necessary, I can provide you with the input files. As they are only for one contig, they are rather small in size...
I would very much appreciate your help. Regards, Nico