cancerit / cgpPindel

Cancer Genome Project Insertion/Deletion detection pipeline based around Pindel
http://cancerit.github.io/cgpPindel/
GNU Affero General Public License v3.0

Error on BAS file generation with bam_stats #101

Closed imendes93 closed 2 years ago

imendes93 commented 2 years ago

Greetings,

I'm trying to generate the .bas files from a set of bam and bam.bai files with bam_stats, but I keep running into the following error:

[ERROR] (./bam_stats.c: main:179 errno: Broken pipe) Error reading header from opened hts file '-'.

I checked the headers of the bam files (attached) and they all seem normal, without any strange characters.

I'm using the container quay.io/wtsicgp/cgppindel:v3.5.0. The input files are publicly available at s3://eu-west-1-example-data/nihr/testdata. Thank you for your assistance.

pb_tumor_header.sam.zip

pb_normal_header.sam.zip

keiranmraine commented 2 years ago

The headers indicate that the file was generated with fq2bam. Unless other processing steps haven't been logged, the data is unmapped. You need to map the data using an appropriate tool chain.

Please see https://github.com/cancerit/PCAP-core bwa_mem.pl (this will generate *.bas as part of the processing) or any other BWA based mapping tool.
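To see which tools a BAM header records, it can help to list the @PG lines. The following is a minimal sketch, assuming the header is piped in as SAM text (for example from `samtools view -H`); the function name `list_pg_programs` is hypothetical and is not part of cgpPindel or PCAP-core.

```shell
# list_pg_programs (hypothetical helper): read a SAM header on STDIN and
# print the ID of every @PG record, one per line.
list_pg_programs() {
  awk -F'\t' '
    $1 == "@PG" {
      # Scan the tab-separated tags of the @PG line for the ID field.
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^ID:/) print substr($i, 4)
      }
    }
  '
}
```

Usage would look like `samtools view -H DRR260185_chr1.bam | list_pg_programs`. If no aligner shows up in the output, the header gives no record of a mapping step.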

imendes93 commented 2 years ago

Hello! I've tried rerunning bam_stats with different data and I still get the same error:

[ERROR] (./bam_stats.c: main:179 errno: Broken pipe) Error reading header from opened hts file '-'.

I used as normal and tumor data the following samples (subset containing just chr1):

Normal

Tumor

Reference

As before, the headers appear normal to me. As I already have the bam files, I was trying to avoid having to remap the data.

Thank you for your help. :)

keiranmraine commented 2 years ago

You haven't provided the command you executed. Are you intentionally piping data to bam_stats?

For file based processing you need to specify the input/output filenames:

Usage: bam_stats -i file -o file [-p plots] [-r reference.fa.fai] [-h] [-v]

-i --input          File path to read in.
-o --output         File path to output.
...

I've been able to generate the BAS files with no issues using the pindel v3.5.0 container with:

$ bam_stats -i DRR260185_chr1.bam -o DRR260185_chr1.bam.bas

It can also read and write via STDIN/STDOUT:

$ cat DRR260185_chr1.bam | bam_stats > DRR260185_chr1.bam.bas

I've generated both BAS files successfully; the command takes about 3-4 minutes to complete with 1 CPU (-@ 2 recommended for BAM).

imendes93 commented 2 years ago

Sorry for the delay. This is the command I've been using:

bam_stats DRR260185_chr1.bam -r Homo_sapiens_assembly38.fasta.fai -o DRR260185_chr1.bam.bas -@ 1

Maybe the error is in the reference, so I'll try it without. Ty!

keiranmraine commented 2 years ago

Your command is incorrect: the -i flag is required before the input filename:

bam_stats -i DRR260185_chr1.bam -o DRR260185_chr1.bam.bas -@ 1

(-r isn't necessary for BAM files)
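When several BAM files need BAS files, a small wrapper that always passes -i and -o explicitly avoids falling back to STDIN by accident. This is only a sketch: the function name `make_bas` and the `BAM_STATS` override variable are assumptions made for illustration, and only the `-i`/`-o` flags shown in the usage text above are used.

```shell
# make_bas (hypothetical helper): write <file>.bas next to each BAM given
# as an argument. BAM_STATS may be set to override the program that is run;
# it defaults to the bam_stats binary found on PATH.
make_bas() {
  local bam_stats="${BAM_STATS:-bam_stats}"
  local bam
  for bam in "$@"; do
    if [ ! -r "$bam" ]; then
      echo "skipping unreadable file: $bam" >&2
      continue
    fi
    # Input and output are both named explicitly, so the tool is never
    # left reading from STDIN or writing to STDOUT.
    "$bam_stats" -i "$bam" -o "$bam.bas" || return 1
  done
}
```

For example, `make_bas DRR260185_chr1.bam` would run the same command as above and write DRR260185_chr1.bam.bas.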

imendes93 commented 2 years ago

Ohh no, you're right! Thank you so much!