cancerit / cgpPindel

Cancer Genome Project Insertion/Deletion detection pipeline based around Pindel
http://cancerit.github.io/cgpPindel/
GNU Affero General Public License v3.0

generated files are corrupt error #100

Closed imendes93 closed 2 years ago

imendes93 commented 2 years ago

Greetings. I'm attempting to run cgpPindel and I'm running into the following error:

STDERR: 20 generated files are corrupt: 
STDERR: /tmp/nxf-3643953189005937501/outdir/pindel/tmpPindel/pb_sample/chr14.txt.gz
.... (continues for the other chr files)
STDERR: /tmp/nxf-3643953189005937501/outdir/pindel/tmpPindel/pb_sample/chr9.txt.gz  
STDERR: /tmp/nxf-3643953189005937501/outdir/pindel/tmpPindel/pb_sample/chr20.txt.gz at /opt/wtsi-cgp/bin/pindel_input_gen.pl line 52.
STDERR: Command exited with non-zero status 255
STDERR: 864.32user 106.25system 9:02.15elapsed 179%CPU (0avgtext+0avgdata 1643492maxresident)k
STDERR: 344inputs+173920outputs (0major+25538947minor)pagefaults 0swaps
Thread 1 terminated abnormally: 
THREAD_EXITED: Wrapper script message:
"/usr/bin/time bash /tmp/nxf-3643953189005937501/outdir/pindel/tmpPindel/logs/Sanger_CGP_Pindel_Implement_input.1.sh 1> /tmp/nxf-3643953189005937501/outdir/pindel/tmpPindel/logs/Sanger_CGP_Pindel_Implement_input.1.out 2> /tmp/nxf-3643953189005937501/outdir/pindel/tmpPindel/logs/Sanger_CGP_Pindel_Implement_input.1.err" unexpectedly returned exit value 255 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 244 thread 1.
 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 235
Thread error: 
THREAD_EXITED: Wrapper script message:
"/usr/bin/time bash /tmp/nxf-3643953189005937501/outdir/pindel/tmpPindel/logs/Sanger_CGP_Pindel_Implement_input.1.sh 1> /tmp/nxf-3643953189005937501/outdir/pindel/tmpPindel/logs/Sanger_CGP_Pindel_Implement_input.1.out 2> /tmp/nxf-3643953189005937501/outdir/pindel/tmpPindel/logs/Sanger_CGP_Pindel_Implement_input.1.err" unexpectedly returned exit value 255 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 244 thread 1.
 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 235

According to the log files, two threads are running. Here's the command that I'm using:

pindel.pl \
    -reference Homo_sapiens_assembly38.fasta \
    -outdir outdir/pindel \
    -tumour pb_tumor.bam \
    -normal pb_normal.bam \
    -simrep simpleRepeats.bed.gz \
    -unmatched pindel_np.gff3.gz \
    -filter softRules.lst \
    -genes codingexon_regions.indel.bed.gz \
    --badloci hiSeqDepth.bed.gz \
    -seqtype WGS \
    -assembly GRCh38 \
    -species Human \
    -e "NC_007605,hs37d5,GL%" \
    -cpus $cpus

I'm running this in the container quay.io/wtsicgp/cgppindel:v3.5.0. The indexes for the BAM, reference, and badloci files are in the same directory.

The input files are publicly available at s3://eu-west-1-example-data/nihr/testdata. Thank you for your assistance!

keiranmraine commented 2 years ago

It looks like your input BAMs both have the same sample name in the RG header field. This can be confirmed by running the following commands and checking the value in the SM element.

samtools view -H pb_tumor.bam | grep '^@RG'
samtools view -H pb_normal.bam | grep '^@RG'

I am unable to access the files at this time.
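The check described above can also be scripted. The following is a minimal sketch, not part of cgpPindel: `sm_values` is a hypothetical helper that reads a SAM header on stdin and prints the distinct SM values found on its @RG lines, assuming the header is tab-separated as the SAM format requires.

```shell
#!/usr/bin/env bash
# sm_values: hypothetical helper (not part of cgpPindel or samtools).
# Reads SAM header text on stdin and prints each distinct SM value
# that appears on an @RG line, one per line.
sm_values() {
    awk -F'\t' '/^@RG/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SM:/) print substr($i, 4)   # drop the "SM:" prefix
    }' | sort -u
}

# Possible usage, assuming samtools is on PATH and the BAM names from
# the pindel.pl command in this issue:
#   tumour_sm=$(samtools view -H pb_tumor.bam  | sm_values)
#   normal_sm=$(samtools view -H pb_normal.bam | sm_values)
#   if [ "$tumour_sm" = "$normal_sm" ]; then
#       echo "tumour and normal share SM:$tumour_sm" >&2
#   fi
```

If both commands print the same value, the two BAMs are labelled as the same sample.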

keiranmraine commented 2 years ago

The headers attached to #101 confirm this:

# pb_normal
@RG ID:pb_sample.1  LB:lib1 PL:bar  SM:pb_sample    PU:pb_sample.1
# pb_tumour
@RG ID:pb_sample.1  LB:lib1 PL:bar  SM:pb_sample    PU:pb_sample.1
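
Both headers carry SM:pb_sample, which is consistent with the pb_sample directory that appears in the paths of the error message. The thread does not say how this was resolved. One possible approach, sketched here under stated assumptions, is to give one of the BAMs a distinct SM value by rewriting its header. `set_sm` is a hypothetical helper, and the new sample name and output file names in the usage comments are placeholders.

```shell
#!/usr/bin/env bash
# set_sm: hypothetical helper (not part of cgpPindel or samtools).
# Reads SAM header text on stdin and writes it to stdout with the SM
# value of every @RG line replaced by the name given as $1.
# Assumes tab-separated header fields, as the SAM format requires.
set_sm() {
    awk -F'\t' -v OFS='\t' -v sm="$1" '
        /^@RG/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SM:/) $i = "SM:" sm
        }
        { print }
    '
}

# Possible usage, assuming samtools is on PATH (names are placeholders):
#   samtools view -H pb_normal.bam | set_sm pb_sample_normal > normal.header.sam
#   samtools reheader normal.header.sam pb_normal.bam > pb_normal.renamed.bam
#   samtools index pb_normal.renamed.bam
```

Only the header is changed in this sketch; the alignment records and their read group IDs are left as they are.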