DiltheyLab / HLA-LA

Fast HLA type inference from whole-genome data
GNU General Public License v3.0

R_1.fastq and R_2.fastq are empty? #106

Open scontente opened 10 months ago

scontente commented 10 months ago

I recently successfully processed 1848 of 1857 WGS files obtained via dbGaP. Four of the files that did not process returned the error: "You didn't activate --longReads, but the two files ... (which store paired-end reads) are empty - this is weird, and I will abort. at /opt2/hla-la/1.0.3/HLA-LA/src/HLA-LA.pl line 513."

Re-running these files several times, including once with a fresh download, did not clear the error.

Can you give any insight into what might be wrong in these sequence files? Below is the readout text from one of the processes that failed:

Identified paths:
samtools_bin: /usr/bin/samtools
bwa_bin: /usr/bin/bwa
java_bin: /usr/bin/java
picard_sam2fastq_bin: /usr/bin/picard-tools
General working directory: /home/sara_contente/HLA-LA/working
Sample-specific working directory: /home/sara_contente/HLA-LA/working/NWD365424
Using /home/sara_contente/HLA-LA/src/../graphs/PRG_MHC_GRCh38_withIMGT/knownReferences/1000G_B38.txt as reference file.
Extract reads from 534 regions...
Extract unmapped reads...
Merging...
[bam_translate] RG tag "NWD365424_CGCTCATT_HL5GLCCXX_L001" on read "E00170:255:HL5GLCCXX:1:1101:1012:58233" encountered with no corresponding entry in header, tag lost. Unknown tags are only reported once per input file for each tag ID.
[bam_translate] RG tag "NWD365424_CGCTCATT_HL5GLCCXX_L002" on read "E00170:255:HL5GLCCXX:2:1101:1012:48810" encountered with no corresponding entry in header, tag lost. Unknown tags are only reported once per input file for each tag ID.
[bam_translate] RG tag "NWD365424_CGCTCATT_HL5GLCCXX_L003" on read "E00170:255:HL5GLCCXX:3:1101:991:28839" encountered with no corresponding entry in header, tag lost. Unknown tags are only reported once per input file for each tag ID.
[bam_translate] RG tag "NWD365424_CGCTCATT_HL5GLCCXX_L004" on read "E00170:255:HL5GLCCXX:4:1101:1083:18168" encountered with no corresponding entry in header, tag lost. Unknown tags are only reported once per input file for each tag ID.
[bam_translate] RG tag "NWD365424_CGCTCATT_HL5GLCCXX_L005" on read "E00170:255:HL5GLCCXX:5:1101:991:33059" encountered with no corresponding entry in header, tag lost. Unknown tags are only reported once per input file for each tag ID.
[bam_translate] RG tag "NWD365424_CGCTCATT_HL5GLCCXX_L006" on read "E00170:255:HL5GLCCXX:6:1101:991:40829" encountered with no corresponding entry in header, tag lost. Unknown tags are only reported once per input file for each tag ID.
[bam_translate] RG tag "NWD365424_CGCTCATT_HL5GLCCXX_L007" on read "E00170:255:HL5GLCCXX:7:1101:991:28031" encountered with no corresponding entry in header, tag lost. Unknown tags are only reported once per input file for each tag ID.
[bam_translate] RG tag "NWD365424_CGCTCATT_HL5GLCCXX_L008" on read "E00170:255:HL5GLCCXX:8:1101:1002:14793" encountered with no corresponding entry in header, tag lost. Unknown tags are only reported once per input file for each tag ID.
Indexing...
Extract FASTQ...
/usr/bin/picard-tools SamToFastq VALIDATION_STRINGENCY=LENIENT I=/home/sara_contente/HLA-LA/working/NWD365424/extraction.bam F=/home/sara_contente/HLA-LA/working/NWD365424/R_1.fastq F2=/home/sara_contente/HLA-LA/working/NWD365424/R_2.fastq FU=/home/sara_contente/HLA-LA/working/NWD365424/R_U.fastq 2>&1
You didn't activate --longReads, but the two files /home/sara_contente/HLA-LA/working/NWD365424/R_1.fastq and /home/sara_contente/HLA-LA/working/NWD365424/R_2.fastq (which store paired-end reads) are empty - this is weird, and I will abort. at /home/sara_contente/HLA-LA/src/HLA-LA.pl line 513.
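
To help narrow this down, here is a rough diagnostic sketch (not part of HLA-LA) that counts the records in the three files written by the SamToFastq command in the log above. The function names `count_fastq_records` and `summarize_extraction` are made up for illustration, and the count assumes plain, uncompressed FASTQ with four lines per record. If R_U.fastq (the FU= output for unpaired reads) holds records while R_1.fastq and R_2.fastq are empty, that might suggest the extracted reads were not recognized as pairs, though this is only a guess.

```python
from pathlib import Path


def count_fastq_records(path):
    """Count records in a FASTQ file, assuming 4 lines per record.

    Returns None when the file does not exist, so a missing file can be
    told apart from an empty one.
    """
    p = Path(path)
    if not p.exists():
        return None
    with p.open() as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def summarize_extraction(sample_dir):
    """Return record counts for the files named in the SamToFastq call.

    The file names are taken from the log output; sample_dir is the
    sample-specific working directory.
    """
    sample_dir = Path(sample_dir)
    names = ("R_1.fastq", "R_2.fastq", "R_U.fastq")
    return {name: count_fastq_records(sample_dir / name) for name in names}
```

For the failing sample, this would be called as `summarize_extraction("/home/sara_contente/HLA-LA/working/NWD365424")`, and the same call on a sample that processed successfully would give a baseline for comparison.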