adamewing / bamsurgeon

tools for adding mutations to existing .bam files, used for testing mutation callers
MIT License

addsv Error missing bwatmp file #175

Open fre335 opened 3 years ago

fre335 commented 3 years ago

Hi there! I'm trying to simulate SVs, but I keep running into an error:

FileNotFoundError: [Errno 2] No such file or directory: 'addsv.tmp/chr22_33871043_33874754_DEL.7b6f5b07-0251-4666-b0ff-a63998cff574.muts.bam'

It seems the problem starts at an earlier step, because the file "bwatmp.7875ea41-f81d-41b7-bf1b-ea54dfe6490a.sam" is empty.

This is the command I used:

python3 ../bamsurgeon/bin/addsv.py -v SV_del.bed -f NA24143.wgs.hg38.chr22.bam -r variant_sim/hg38_canonical.fa.fasta -l 100 --aligner mem --debug -o NA24143.wgs.chr22_simulated.bam &>NA24143_wgs_del.log

This is how my bed file looks:

chr22 33871043 33874754 DEL 0.75
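For reference, the line above can be read as five whitespace-separated fields. Below is a minimal sketch of how such a line might be parsed, assuming the columns are chromosome, start, end, SV type, and target allele fraction (the `action: DEL 0.75` and `final VAF ... 0.750000` lines in the log further down are consistent with that reading). `SVRecord` and `parse_sv_line` are hypothetical names used for illustration and are not part of bamsurgeon.

```python
from typing import NamedTuple


class SVRecord(NamedTuple):
    """One SV request, as assumed from the five-column line in this post."""
    chrom: str
    start: int
    end: int
    action: str
    vaf: float


def parse_sv_line(line: str) -> SVRecord:
    """Parse a whitespace-separated SV line and run basic sanity checks."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}: {line!r}")
    chrom, start, end, action, vaf = fields[:5]
    rec = SVRecord(chrom, int(start), int(end), action, float(vaf))
    if rec.start >= rec.end:
        raise ValueError("start must be smaller than end")
    if not 0.0 < rec.vaf <= 1.0:
        raise ValueError("allele fraction must be in (0, 1]")
    return rec


rec = parse_sv_line("chr22 33871043 33874754 DEL 0.75")
print(rec.action, rec.end - rec.start, rec.vaf)  # DEL 3711 0.75
```

Running a check like this before calling addsv.py only confirms that the line is well formed; it says nothing about whether the region can be assembled.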

This is the output:

`WARNING 2021-02-09 15:22:58,856 cannot find mate for read marked paired: D00360:94:H2YT5BCXX:1:2207:10181:21772 INFO 2021-02-09 15:23:10,989 found mates for 1000 reads, 0.00 discordant. INFO 2021-02-09 15:23:12,722 found 1065 reads in region. [0.000009] Reading FastA file addsv.tmp/chr22_33871043_33874754_DEL.tmpreads.71817fd1-f0b2-49e5-b2b5-fe3571f5d529.fasta; [0.167270] 1096 sequences found [0.167287] Done [0.170336] Reading read set file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/Sequences; [0.183409] 1096 sequences found [0.192503] Done [0.192517] 1096 sequences in total. [0.197740] Writing into roadmap file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/Roadmaps... [0.201799] Inputting sequences... [0.201817] Inputting sequence 0 / 1096 [0.415459] === Sequences loaded in 0.213761 s [0.416716] Done inputting sequences [0.416731] Destroying splay table [0.439146] Splay table destroyed [0.000010] Reading roadmap file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/Roadmaps [0.019355] 1096 roadmaps read [0.019439] Creating insertion markers [0.019598] Ordering insertion markers [0.022015] Counting preNodes [0.022173] 2390 preNodes counted, creating them now [0.036117] Adjusting marker info... [0.036300] Connecting preNodes [0.039974] Cleaning up memory [0.039991] Done creating preGraph [0.040001] Concatenation... [0.040699] Renumbering preNodes [0.040711] Initial preNode count 2390 [0.040784] Destroyed 1313 preNodes [0.040795] Concatenation over! [0.040805] Clipping short tips off preGraph [0.040907] Concatenation... [0.041089] Renumbering preNodes [0.041102] Initial preNode count 1077 [0.041153] Destroyed 339 preNodes [0.041164] Concatenation over! [0.041173] 178 tips cut off [0.041184] 738 nodes left [0.045900] Writing into pregraph file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/PreGraph... 
[0.069851] Reading read set file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/Sequences; [0.070397] 1096 sequences found [0.076816] Done [0.082083] Reading pre-graph file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/PreGraph [0.084935] Graph has 738 nodes and 1096 sequences [0.091187] Scanning pre-graph file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/PreGraph for k-mers [0.093176] 16125 kmers found [0.096812] Sorting kmer occurence table ... [0.103009] Sorting done. [0.103018] Computing acceleration table... [0.145077] Computing offsets... [0.145194] Ghost Threading through reads 0 / 1096 [0.169069] === Ghost-Threaded in 0.023874 s [0.169093] Threading through reads 0 / 1096 [0.188364] === Threaded in 0.019272 s [0.202754] Correcting graph with cutoff 0.200000 [0.202809] Determining eligible starting points [0.203444] Done listing starting nodes [0.203453] Initializing todo lists [0.203509] Done with initilization [0.203521] Activating arc lookup table [0.203560] Done activating arc lookup table [0.217556] Concatenation... [0.217584] Renumbering nodes [0.217595] Initial node count 738 [0.217610] Removed 581 null nodes [0.217621] Concatenation over! [0.217632] Clipping short tips off graph, drastic [0.217656] Concatenation... [0.217705] Renumbering nodes [0.217717] Initial node count 157 [0.217729] Removed 14 null nodes [0.217740] Concatenation over! [0.217750] 143 nodes left [0.223701] Writing into graph file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/Graph2... [0.258715] Measuring median coverage depth... [0.258773] Median coverage depth = 14.542781 [0.260745] Removing contigs with coverage < 7.271390... [0.260863] Concatenation... [0.261068] Renumbering nodes [0.261079] Initial node count 143 [0.261089] Removed 135 null nodes [0.261099] Concatenation over! [0.261109] Concatenation... [0.261119] Renumbering nodes [0.261129] Initial node count 8 [0.261138] Removed 0 null nodes [0.261148] Concatenation over! 
[0.261158] Clipping short tips off graph, drastic [0.261168] Concatenation... [0.261178] Renumbering nodes [0.261187] Initial node count 8 [0.261197] Removed 0 null nodes [0.261206] Concatenation over! [0.261216] 8 nodes left [0.261225] Read coherency... [0.261235] Identifying unique nodes [0.261245] Done, 0 unique nodes counted [0.261254] Trimming read tips [0.261264] Renumbering nodes [0.261273] Initial node count 8 [0.261282] Removed 0 null nodes [0.261292] Confronted to 0 multiple hits and 0 null over 0 [0.261479] Read coherency over! [0.261498] Starting pebble resolution... [0.261513] Computing read to node mapping array sizes [0.261534] Computing read to node mappings [0.261564] Estimating library insert lengths... [0.261582] Done [0.261592] Computing direct node to node mappings [0.261611] Scaffolding node 0 [0.261629] === Nodes Scaffolded in 0.000018 s [0.261641] Preparing to correct graph with cutoff 0.200000 [0.261672] Cleaning memory [0.261683] Deactivating local correction settings [0.261697] Pebble done. [0.261707] Starting pebble resolution... [0.261723] Computing read to node mapping array sizes [0.261740] Computing read to node mappings [0.261759] Estimating library insert lengths... [0.261777] Done [0.261787] Computing direct node to node mappings [0.261805] Scaffolding node 0 [0.261824] === Nodes Scaffolded in 0.000019 s [0.261834] Preparing to correct graph with cutoff 0.200000 [0.261859] Cleaning memory [0.261869] Deactivating local correction settings [0.261883] Pebble done. [0.261893] Concatenation... [0.261903] Renumbering nodes [0.261912] Initial node count 8 [0.261922] Removed 0 null nodes [0.261932] Concatenation over! [0.261941] Removing reference contigs with coverage < 7.271390... [0.261952] Concatenation... [0.261962] Renumbering nodes [0.261971] Initial node count 8 [0.261981] Removed 0 null nodes [0.261990] Concatenation over! [0.268407] Writing contigs into addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/contigs.fa... 
[0.277561] Writing into stats file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/stats.txt...
[0.284527] Writing into graph file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/LastGraph...
[0.295381] Writing into AMOS file addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/velvet_asm.afg...
[0.595490] Printing unused reads into addsv.tmp/chr22_33871043_33874754_DEL.df07c15a/UnusedReads.fa
[0.601631] Estimated Coverage = 14.542781
[0.601652] Estimated Coverage cutoff = 7.271390
Final graph has 8 nodes and n50 of 1551, max 1662, total 4868, using 1091/1096 reads
INFO 2021-02-09 15:23:14,640 chr22_33871043_33874754_DEL best contig length: 1692
INFO 2021-02-09 15:23:14,723 chr22_33871043_33874754_DEL alignment result: ['SUMMARY', '6415', '0', '1292', '2708', '4000']
INFO 2021-02-09 15:23:14,726 chr22_33871043_33874754_DEL trimmed contig length: 1292
INFO 2021-02-09 15:23:14,728 chr22_33871043_33874754_DEL start: 33870898, end: 33874898, tgtstart: 2708, tgtend: 4000, refstart: 33873606, refend: 33874898
INFO 2021-02-09 15:23:14,729 chr22_33871043_33874754_DEL action: DEL 0.75 DEL
INFO 2021-02-09 15:23:14,732 chr22_33871043_33874754_DEL final VAF accounting for copy number 1.000000: 0.750000
WARNING 2021-02-09 15:23:14,734 chr22_33871043_33874754_DEL contig does not cover user start
INFO 2021-02-09 15:23:14,735 chr22_33871043_33874754_DEL set paired end mean distance: 300.000000
INFO 2021-02-09 15:23:14,737 chr22_33871043_33874754_DEL set paired end distance stddev: 70.000000
INFO 2021-02-09 15:23:14,746 chr22_33871043_33874754_DEL paired reads: 134
INFO 2021-02-09 15:23:14,748 chr22_33871043_33874754_DEL single reads: 35
INFO 2021-02-09 15:23:14,750 chr22_33871043_33874754_DEL discard reads: 0
INFO 2021-02-09 15:23:14,752 chr22_33871043_33874754_DEL total reads: 303
INFO 2021-02-09 15:23:14,754 chr22_33871043_33874754_DEL old ctg len: 1292
INFO 2021-02-09 15:23:14,756 chr22_33871043_33874754_DEL new ctg len: 200
INFO 2021-02-09 15:23:14,758 chr22_33871043_33874754_DEL adj. factor: 0.154799
INFO 2021-02-09 15:23:14,759 chr22_33871043_33874754_DEL num. sim. reads: 17
INFO 2021-02-09 15:23:14,760 chr22_33871043_33874754_DEL PE mean outer distance: 300.000000
INFO 2021-02-09 15:23:14,762 chr22_33871043_33874754_DEL PE outer distance SD: 70.000000
INFO 2021-02-09 15:23:14,763 chr22_33871043_33874754_DEL rerror rate: 0.000000
INFO 2021-02-09 15:23:14,766 ['wgsim', '-e', '0.0', '-d', '300.0', '-s', '70.0', '-N', '17', '-1', '250', '-2', '250', '-r', '0', '-R', '0', 'addsv.tmp/chr22_33871043_33874754_DEL.wgsimtmp.9052dfd2-9255-431b-899f-47f40fd712ed.fasta', 'addsv.tmp/chr22_33871043_33874754_DEL.wgsimtmp.9052dfd2-9255-431b-899f-47f40fd712ed.1.fq', 'addsv.tmp/chr22_33871043_33874754_DEL.wgsimtmp.9052dfd2-9255-431b-899f-47f40fd712ed.2.fq']
[wgsim] seed = 1612884194
[wgsim_core] calculating the total length of the reference sequence...
[wgsim_core] 1 sequences, total length: 200
[wgsim_core] skip sequence 'target' as it is shorter than 510!
INFO 2021-02-09 15:23:14,836 chr22_33871043_33874754_DEL aligning addsv.tmp/chr22_33871043_33874754_DEL.wgsimtmp.9052dfd2-9255-431b-899f-47f40fd712ed.1.fq,addsv.tmp/chr22_33871043_33874754_DEL.wgsimtmp.9052dfd2-9255-431b-899f-47f40fd712ed.2.fq with bwa mem
[E::bwa_idx_load_from_disk] fail to locate the index files
INFO 2021-02-09 15:23:14,887 chr22_33871043_33874754_DEL writing bwatmp.7875ea41-f81d-41b7-bf1b-ea54dfe6490a.sam to BAM...
[main_samview] fail to read the header from "bwatmp.7875ea41-f81d-41b7-bf1b-ea54dfe6490a.sam".
INFO 2021-02-09 15:23:14,926 chr22_33871043_33874754_DEL deleting SAM: bwatmp.7875ea41-f81d-41b7-bf1b-ea54dfe6490a.sam
INFO 2021-02-09 15:23:14,930 chr22_33871043_33874754_DEL sorting output: samtools sort -@ 1 -T bwatmp.7875ea41-f81d-41b7-bf1b-ea54dfe6490a.sorted.bam -o bwatmp.7875ea41-f81d-41b7-bf1b-ea54dfe6490a.sorted.bam addsv.tmp/chr22_33871043_33874754_DEL.7b6f5b07-0251-4666-b0ff-a63998cff574.muts.bam
[E::hts_open_format] Failed to open file "addsv.tmp/chr22_33871043_33874754_DEL.7b6f5b07-0251-4666-b0ff-a63998cff574.muts.bam" : No such file or directory
samtools sort: can't open "addsv.tmp/chr22_33871043_33874754_DEL.7b6f5b07-0251-4666-b0ff-a63998cff574.muts.bam": No such file or directory
INFO 2021-02-09 15:23:14,976 chr22_33871043_33874754_DEL remove original bam:addsv.tmp/chr22_33871043_33874754_DEL.7b6f5b07-0251-4666-b0ff-a63998cff574.muts.bam
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "../bamsurgeon/bin/addsv.py", line 945, in makemut
    outreads = aligners.remap_fastq(args.aligner, fq1, fq2, args.refFasta, outbam_mutsfile, alignopts, mutid=mutid, threads=int(args.alignerthreads))
  File "/usr/local/lib/python3.8/dist-packages/bamsurgeon-1.2-py3.8.egg/bamsurgeon/aligners.py", line 718, in remap_fastq
    return remap_bwamem_fastq(fq1, fq2, threads, fastaref, outbam, deltmp=deltmp, mutid=mutid)
  File "/usr/local/lib/python3.8/dist-packages/bamsurgeon-1.2-py3.8.egg/bamsurgeon/aligners.py", line 757, in remap_bwamem_fastq
    os.remove(outbam)
FileNotFoundError: [Errno 2] No such file or directory: 'addsv.tmp/chr22_33871043_33874754_DEL.7b6f5b07-0251-4666-b0ff-a63998cff574.muts.bam'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "../bamsurgeon/bin/addsv.py", line 1360, in <module>
    main(args)
  File "../bamsurgeon/bin/addsv.py", line 1121, in main
    tmpbam, exclfn, mutinfo = result.get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
FileNotFoundError: [Errno 2] No such file or directory: 'addsv.tmp/chr22_33871043_33874754_DEL.7b6f5b07-0251-4666-b0ff-a63998cff574.muts.bam'`

If anyone has an idea what is wrong, I would be very happy if you could help me with this.
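One detail in the log may be worth checking: wgsim reports `skip sequence 'target' as it is shorter than 510!`. The value 510 matches the mean outer distance plus three standard deviations from the wgsim command line (`-d 300 -s 70`), while the rearranged contig is only 200 bp (`new ctg len: 200`). The arithmetic can be sketched as follows, assuming the skip rule is "length < d + 3s". That rule is inferred from the numbers in the log rather than taken from documentation, and the function names are hypothetical.

```python
def wgsim_min_length(mean_dist: int, stddev: int) -> int:
    """Shortest sequence that would not be skipped, under the assumed d + 3*s rule."""
    return mean_dist + 3 * stddev


def would_be_skipped(contig_len: int, mean_dist: int, stddev: int) -> bool:
    """True if a contig of this length falls below the assumed threshold."""
    return contig_len < wgsim_min_length(mean_dist, stddev)


# Values taken from the log: -d 300, -s 70, new contig length 200.
print(wgsim_min_length(300, 70))       # 510
print(would_be_skipped(200, 300, 70))  # True
```

If that reading is right, wgsim simulated no reads for this contig, so the FASTQ files handed to bwa would be empty regardless of any other problem.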

adamewing commented 3 years ago

Hi there,

Are you able to run the python setup.py install command successfully or does it complain about missing dependencies?

Also, are you able to run test_sv.sh from the test/ directory?

fre335 commented 3 years ago

Hi Adam, thanks for your reply. I am running the script in the Docker image, and the test script test_sv.sh runs successfully, so it shouldn't be a problem with the dependencies.
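A passing test script does not rule out a problem specific to this reference: the log contains `[E::bwa_idx_load_from_disk] fail to locate the index files`, which bwa prints when it cannot find an index for the FASTA it was given. Below is a minimal sketch of a pre-flight check, assuming the index should have been built with `bwa index`, which writes `.amb`, `.ann`, `.bwt`, `.pac` and `.sa` files next to the reference. `missing_bwa_index_files` is a hypothetical helper and not part of bamsurgeon.

```python
import os
from typing import List

# Files that `bwa index ref.fa` creates alongside ref.fa.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(ref_fasta: str) -> List[str]:
    """Return the expected bwa index files that do not exist for ref_fasta."""
    return [
        ref_fasta + suffix
        for suffix in BWA_INDEX_SUFFIXES
        if not os.path.isfile(ref_fasta + suffix)
    ]


# Reference path copied from the -r argument of the command in this thread.
ref = "variant_sim/hg38_canonical.fa.fasta"
missing = missing_bwa_index_files(ref)
if missing:
    print("Missing bwa index files:")
    for path in missing:
        print("  " + path)
    print("Consider running: bwa index " + ref)
else:
    print("bwa index files found for " + ref)
```

If any files are reported missing, indexing the reference inside the same container that runs addsv.py would be a reasonable first thing to try.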