blachlylab / fade

Fragmentase Artifact Detection and Elimination
MIT License

fade out -c fails or produces truncated bam #28

Closed rhalperin closed 2 years ago

rhalperin commented 2 years ago

I recently ran fade out -c on a set of 28 bams. In all cases, I see the following warning printed to stderr many times:

[E::bam_write1] Positional data is too large for BAM format

For 16 of the bams, it completes successfully and appears to produce a complete bam. For seven of the bams, it prints the summary statistics, for example:

read count: 265354955
Clipped %:  0.313838
% With Supplementary alns:  0.00732937
Artifact rate:  0.00724173
% With Supplementary alns and artifacts:    0.00151888
Artifact rate left only:    0.00361884
Artifact rate right only:   0.00362289

but then gives the error:

Failed to flush stdout: Bad file descriptor

For the others, it completes with exit status 0, but samtools sort gives:

[E::bgzf_read_block] Invalid BGZF header at offset 27328607662
[E::bgzf_read] Read block operation failed with error 6 after 0 of 4 bytes
samtools sort: truncated file. Aborting

fade out without the -c option completed successfully on all 28 bams, with no warnings or errors. I don't see any of these issues with fade out -c when running on one of these bams subset to a region, so I am not able to share data here that reproduces the issue.
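The samtools errors above indicate the output BAM is truncated. One quick sanity check (a sketch, not part of fade or samtools): BAM files are BGZF-compressed, and the SAM/BAM specification requires a valid BGZF file to end with a fixed 28-byte empty-block EOF marker, so a file missing that marker was truncated mid-write.

```python
# 28-byte BGZF EOF marker, as defined in the SAM/BAM specification.
BGZF_EOF = bytes.fromhex(
    "1f8b08040000000000ff0600424302001b0003000000000000000000"
)

def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF block,
    i.e. it was not truncated mid-write."""
    with open(path, "rb") as f:
        f.seek(0, 2)                  # seek to end to get file size
        size = f.tell()
        if size < len(BGZF_EOF):
            return False
        f.seek(size - len(BGZF_EOF))  # read the last 28 bytes
        return f.read() == BGZF_EOF
```

`samtools quickcheck` performs this check (among others), but the function above shows exactly what "truncated file" means here.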

charlesgregory commented 2 years ago

Honestly, I wrote this section of fade a long time ago and I am certain it has bugs. I have just rewritten it; it runs on my test data and I am able to run samtools sort on the output. I will push a new release with the fixes, which will hopefully address the issue.

charlesgregory commented 2 years ago

> Honestly, I wrote this section of fade a long time ago

I am referring to fade out with the -c flag. It hard clips the artifacts out of the reads, and trimming the CIGAR and read sequence correctly is a somewhat tricky operation.
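To illustrate why this trimming is tricky (a hypothetical sketch, not fade's actual code): removing n query bases from the left end of a read means query-consuming CIGAR ops (M/I/S/=/X) must be shortened or dropped, reference-only ops (D/N) left dangling at the clip boundary must be discarded, and the removed bases become a leading hard clip (H).

```python
QUERY_OPS = set("MIS=X")   # CIGAR ops that consume query bases
REF_ONLY_OPS = set("DN")   # CIGAR ops that consume reference only

def hard_clip_left(cigar, n):
    """Trim n query bases from the left of a [(op, length), ...] CIGAR,
    returning a new CIGAR with a leading hard clip of length n."""
    out = []
    remaining = n
    i = 0
    while i < len(cigar) and remaining > 0:
        op, length = cigar[i]
        if op in QUERY_OPS:
            if length <= remaining:
                remaining -= length                    # drop whole op
            else:
                out.append((op, length - remaining))   # shorten op
                remaining = 0
        # reference-only ops inside the clipped span are dropped
        i += 1
    out.extend(cigar[i:])
    # a reference-only op cannot start the post-clip alignment
    while out and out[0][0] in REF_ONLY_OPS:
        out.pop(0)
    return [("H", n)] + out
```

Note this sketch only handles the CIGAR; the real operation must also trim the SEQ and QUAL fields in step and adjust the alignment start position by the reference bases consumed, which is where subtle bugs creep in.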

charlesgregory commented 2 years ago

I have a new version out, v0.6.0. Could you see if it fixes the issue?

rhalperin commented 2 years ago

Yes, v0.6.0 appears to fix the issue, thanks!