adamewing / bamsurgeon

tools for adding mutations to existing .bam files, used for testing mutation callers
MIT License

addsnv.py Empty output bam file #151

Open HeXY0515 opened 4 years ago

HeXY0515 commented 4 years ago

Hi Adam,

I got an empty bam file after running addsnv.py. The command was:

addsnv.py --varfile simu_snv_sites.txt --bamfile simu_N.bam --reference ucsc.hg19.fasta --procs 12 --maxdepth 4000 --coverdiff 0.5 --picardjar picard.jar --aligner mem --seed 113 --outbam simu_T.bam > addsnv_log.txt 2>&1

The size of the input bam file is 5458513394. The 500 SNV sites were generated with the command: randomsites.py --genome ucsc.hg19.fasta --bed Illumina_Target.bed --seed 111 --numpicks 500 --avoidN --minvaf 0.10 --maxvaf 0.5 snv > simu_snv_sites.txt

The last record in the log file (addsnv_log.txt) is: contig mismatch: chr1 Target and donor are aligned to incompatable reference genomes!

Would you please help me out? I can send you the log file if you need it. Thanks!

adamewing commented 4 years ago

Hi, so the genome .fasta passed via --reference needs to be the same as the reference used to generate the .bam file passed to --bamfile. If you don't have the matching .fasta you can try making the reference you have match the .bam using the script in scripts/match_fasta_to_bam.py, although it won't work in all cases. Hope that helps.
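
A rough way to check whether a .fasta and a .bam agree is to compare contig names and lengths from the BAM header (for example the text printed by `samtools view -H simu_N.bam`) against the FASTA index (the .fai file written by `samtools faidx`). The sketch below is only an illustration of that comparison, not bamsurgeon's own check; the function names are made up for this example, and matching names and lengths does not prove the sequences themselves are identical.

```python
def parse_sam_header_contigs(header_text):
    """Return {contig_name: length} from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:VALUE.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def parse_fai_contigs(fai_text):
    """Return {contig_name: length} from the text of a .fai index."""
    contigs = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        # The first two tab-separated columns are name and length.
        name, length = line.split("\t")[:2]
        contigs[name] = int(length)
    return contigs


def compare_contigs(bam_contigs, fasta_contigs):
    """List human-readable differences between the two contig tables."""
    problems = []
    for name, length in bam_contigs.items():
        if name not in fasta_contigs:
            problems.append(f"{name}: in BAM header but not in FASTA")
        elif fasta_contigs[name] != length:
            problems.append(
                f"{name}: length {length} in BAM vs {fasta_contigs[name]} in FASTA"
            )
    for name in fasta_contigs:
        if name not in bam_contigs:
            problems.append(f"{name}: in FASTA but not in BAM header")
    return problems
```

To use it, save the header text and read the .fai file into strings, pass them through the two parsers, and print whatever `compare_contigs` returns. An empty list means names and lengths agree; a typical mismatch is one file using `chr1` and the other using `1`.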

HeXY0515 commented 4 years ago

Thanks, Adam! I am sure the reference genome (--reference) is the same as the reference genome used to generate the .bam file.

I changed the input to WES data for the simulation and randomly inserted 500 SNVs and 15 indels (both generated by randomsites.py). The result was that 481 SNVs were inserted successfully (checked with makevcf.py), but 0 indels were inserted (also checked with makevcf.py). I checked the indel logs in detail and found no warnings or errors.

I have sent the log file of addindel.py to you. I am looking forward to receiving your reply. Thanks!