Closed unode closed 5 years ago
While trying to workaround the problem above I ran into a different issue but the error above may be related to the same problem.
Using:
ngless "0.8"
import "parallel" version "0.6"
import "mocat" version "0.0"
import "samtools" version "0.0"
samples = readlines(ARGV[3])
sample = lock1([ARGV[1]])
input = load_mocat_sample(ARGV[2] + '/' + sample)
mapped = map(input, fafile=ARGV[4], mode_all=True)
mapped = select(mapped) using |mr|:
mr = mr.filter(min_match_size=45, min_identity_pc=97, action={unmatch})
mapped_non_unique = select(mapped, keep_if=[{mapped}])
# workaround
write(mapped_non_unique, ofile='outputs/' + sample + '.test.bam')
mapped_non_unique = samfile('outputs/' + sample + '.test.bam')
# workaround
mapped_non_unique = samtools_sort(mapped_non_unique, by={name})
write(mapped_non_unique, ofile='outputs/' + sample + '.namesorted.bam')
collect(qcstats({fastq}), ofile='outputs/fqstats.txt', current=sample, allneeded=samples)
collect(qcstats({mapping}), ofile='outputs/mapstats.txt', current=sample, allneeded=samples)
I got:
...
[Mon 29-10-2018 03:03:45] Line 10: Finished mapping to db.fna
[Mon 29-10-2018 03:03:45] Line 10: Finished mapping to db.fna
[Mon 29-10-2018 03:03:45] Line 10: Total reads: 95047
[Mon 29-10-2018 03:03:45] Line 10: Total reads aligned: 54160 [56.98%]
[Mon 29-10-2018 03:03:45] Line 10: Total reads Unique map: 38849 [40.87%]
[Mon 29-10-2018 03:03:45] Line 10: Total reads Non-Unique map: 15311 [16.11%]
[Mon 29-10-2018 03:03:45] Line 11: Running garbage collection.
[Mon 29-10-2018 03:03:45] Line 11: Interpreting [interpretIO]: mapped = select(Lookup 'mapped' as NGLMappedReadSet)using {Block {blockVariable = [Variable "mr"], blockBody = Sequence [mr = (Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "filter"}( Nothing; min_match_size=45; min_identity_pc=97; action={unmatch} )]}}
[Mon 29-10-2018 03:03:45] Line 11: Interpreting [assignment]: select(Lookup 'mapped' as NGLMappedReadSet)using {Block {blockVariable = [Variable "mr"], blockBody = Sequence [mr = (Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "filter"}( Nothing; min_match_size=45; min_identity_pc=97; action={unmatch} )]}}
[Mon 29-10-2018 03:03:45] Line 11: Executing blocked select on file /tmp/23095980/mapped_db.27976-0.sam
[Mon 29-10-2018 03:03:45] Line 11: Created & opened temporary file /tmp/23095980/block_selected_mapped_db27976-1.sam
[Mon 29-10-2018 03:03:46] Line 14: Running garbage collection.
[Mon 29-10-2018 03:03:46] Line 14: Removing temporary file: /tmp/23095980/mapped_db.27976-0.sam
[Mon 29-10-2018 03:03:46] Line 14: Interpreting [interpretIO]: mapped_non_unique = select(Lookup 'mapped' as NGLMappedReadSet; keep_if=[{mapped}])
[Mon 29-10-2018 03:03:46] Line 14: Interpreting [assignment]: select(Lookup 'mapped' as NGLMappedReadSet; keep_if=[{mapped}])
[Mon 29-10-2018 03:03:46] Line 17: Running garbage collection.
[Mon 29-10-2018 03:03:46] Line 17: Removing temporary file: /tmp/23095980/mapped_db.27976-0.sam
[Mon 29-10-2018 03:03:46] Line 17: Interpreting [interpretIO]: write(Lookup 'mapped_non_unique' as NGLMappedReadSet; __hash="96241298d14e84a92cabbadc8bc7d00a"; ofile="outputs/"BOpAddLookup 'sample' as NGLStringBOpAdd".test.bam")
[Mon 29-10-2018 03:03:46] Line 17: Interpreting [write]: NGOMappedReadSet {nglgroupName = "data/sample", nglSamFile = <STREAM>, nglReference = Nothing}
[Mon 29-10-2018 03:03:46] Line 17: Created & opened temporary file /tmp/23095980/selected_block_selected_mapped_db27976-1streamed_.27976-2.sam
[Mon 29-10-2018 03:03:46] Line 17: Finished mapping to select.lno14
[Mon 29-10-2018 03:03:46] Line 17: Total reads: 35647
[Mon 29-10-2018 03:03:46] Line 17: Total reads aligned: 35647 [100.00%]
[Mon 29-10-2018 03:03:46] Line 17: Total reads Unique map: 27792 [77.96%]
[Mon 29-10-2018 03:03:46] Line 17: Total reads Non-Unique map: 7855 [22.04%]
Exiting after fatal error while loading and running script
System Error
Failure on converting sam to bam1
[Mon 29-10-2018 03:03:46] Line 17: Created & opened temporary file /tmp/23095980/converted_selected_block_selected_mapped_db27976-1streamed_27976-3.bam
[Mon 29-10-2018 03:03:46] Line 17: SAM->BAM Conversion start ('/tmp/23095980/selected_block_selected_mapped_db27976-1streamed_.27976-2.sam' -> '/tmp/23095980/converted_selected_block_selected_mapped_db27976-1streamed_27976-3.bam')
[Mon 29-10-2018 03:03:46] Line 17: Message from samtools: [E::sam_parse1] CIGAR and query sequence are of different length
[W::sam_read1] Parse error at line 43804
[main_samview] truncated file.
It's actually not related to large BAM files at all, it's just a corner case that doesn't happen that frequently (something like 1 out of 50,000 sequences)
In the case of the broken pipe above, is there any way that ngless can still capture and display the error? It seems like the actual message is lost and the only diagnostic is the broken pipe.
Thanks for the fix.
Current master dc769d7 fails with:
when using:
and a sufficiently large
input.bam
file.I tried reproducing this with the
.bam
files we have in the repository but none was big enough to trigger the broken pipe.I managed to reproduce it locally with a 13MB bam file generated with: