halelab / GBS-SNP-CROP

GBS SNP Calling Reference Optional Pipeline
GNU General Public License v2.0

Empty output files in STEP 5 (Tutorial data) #24

Closed angelaparodymerino closed 4 years ago

angelaparodymerino commented 5 years ago

Hi,

I moved on from STEP 4, whose output files look good, to STEP 5 (note that I am using the Tutorial dataset and the tutorial scripts). Although I did not get any error messages or warnings, I obtained empty output files. This is the run of STEP 5:

~/GBS-SNP-CROP/GBS-SNP-CROP-scripts/v.4.0$ perl GBS-SNP-CROP-5.pl -bw /usr/local/bin/bwa -st /usr/local/bin/samtools -d PE -b barcodesID.txt -ref MR.Genome.fa -Q 30 -q 30 -F 2308 -f 2 -t 10 -Opt 0

#################################
# GBS-SNP-CROP, Step 5, v.4.0
#################################

Indexing reference FASTA file ...
[bwa_index] Pack FASTA... 0.04 sec
[bwa_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=2587850, availableWord=3287896
[bwt_gen] Finished constructing BWT in 5 iterations.
[bwa_index] 0.35 seconds elapse.
[bwa_index] Update BWT... 0.01 sec
[bwa_index] Pack forward-only FASTA... 0.03 sec
[bwa_index] Construct SA from BWT and Occ... 0.09 sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa index -a bwtsw MR.Genome.fa
[main] Real time: 0.578 sec; CPU: 0.512 sec
DONE.

Mapping paired Lib1_01.R1.fq.gz Lib1_01.R2.fq.gz files to MR.Genome.fa ...
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[E::main_mem] fail to open file `Lib1_01.R1.fq.gz'.
Mapping paired Lib1_02.R1.fq.gz Lib1_02.R2.fq.gz files to MR.Genome.fa ...
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[E::main_mem] fail to open file `Lib1_02.R1.fq.gz'.
Mapping paired Lib1_03.R1.fq.gz Lib1_03.R2.fq.gz files to MR.Genome.fa ...
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[E::main_mem] fail to open file `Lib1_03.R1.fq.gz'.
Mapping paired Lib1_04.R1.fq.gz Lib1_04.R2.fq.gz files to MR.Genome.fa ...
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[E::main_mem] fail to open file `Lib1_04.R1.fq.gz'.
Mapping paired Lib1_05.R1.fq.gz Lib1_05.R2.fq.gz files to MR.Genome.fa ...
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[E::main_mem] fail to open file `Lib1_05.R1.fq.gz'.
DONE.

Processing the SAM files ...
DONE.

Sorting the BAM files ...
DONE.

Indexing the sorted BAM files ...
DONE.

Indexing the reference genome FASTA file ...
DONE.

Producing the mpileup files ...
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
DONE.

Elapsed time: 0.08 min
Please cite: Melo et al. (2016) GBS-SNP-CROP: A reference-optional pipeline for
SNP discovery and plant germplasm characterization using variable length, paired-end
genotyping-by-sequencing data. BMC Bioinformatics. DOI 10.1186/s12859-016-0879-y.

This is the content of the folder after running Script 5 (folders in bold):

alignments <-- NEW
barcodesID.txt
demultiplexed
distribs
FastaCQonLibraries
FastaForRef
GBS-SNP-CROP-1.pl
GBS-SNP-CROP-2.pl
GBS-SNP-CROP-3.pl
GBS-SNP-CROP-4.pl
GBS-SNP-CROP-5.pl
GBS-SNP-CROP-6.pl
GBS-SNP-CROP-7.pl
GBS-SNP-CROP-8.pl
GBS-SNP-CROP-9.pl
Lib1_01.mpileup <-- NEW (empty)
Lib1_01.R1.fastq
Lib1_01.R2.fastq
Lib1_02.mpileup <-- NEW (empty)
Lib1_02.R1.fastq
Lib1_02.R2.fastq
Lib1_03.mpileup <-- NEW (empty)
Lib1_03.R1.fastq
Lib1_03.R2.fastq
Lib1_04.mpileup <-- NEW (empty)
Lib1_04.R1.fastq
Lib1_04.R2.fastq
Lib1_05.mpileup <-- NEW (empty)
Lib1_05.R1.fastq
Lib1_05.R2.fastq
MR.Clusters.fa (1.2 MB)
MR.Genome.fa (1.3 MB)
MR.Genome.fa.amb <-- NEW (2.2 MB)
MR.Genome.fa.ann <-- NEW (53 Bytes)
MR.Genome.fa.bwt <-- NEW (41 Bytes)
MR.Genome.fa.fai <-- NEW (13 MB)
MR.Genome.fa.pac <-- NEW (323 KB)
MR.Genome.fa.sa <-- NEW (647 KB)
parsed
Pear.log
PosToMask.txt
singles
summaries
variants

The new "alignments" folder contains these files, but they all look empty:

Lib1_01.bam Lib1_01.sam Lib1_01.sorted.bam Lib1_01.sorted.bam.bai
Lib1_02.bam Lib1_02.sam Lib1_02.sorted.bam Lib1_02.sorted.bam.bai
Lib1_03.bam Lib1_03.sam Lib1_03.sorted.bam Lib1_03.sorted.bam.bai
Lib1_04.bam Lib1_04.sam Lib1_04.sorted.bam Lib1_04.sorted.bam.bai
Lib1_05.bam Lib1_05.sam Lib1_05.sorted.bam Lib1_05.sorted.bam.bai

Could someone help me to solve this?
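One possible reading of the log above: bwa reports `fail to open file` for `Lib1_01.R1.fq.gz`, while the folder listing shows `Lib1_01.R1.fastq`, so the mapping step may never have received any reads, which would leave every downstream file empty. The sketch below is a hypothetical helper (`missing_inputs` is not part of GBS-SNP-CROP) that lists expected read files that are missing or empty. The `SAMPLE.R1.SUFFIX` / `SAMPLE.R2.SUFFIX` naming is taken from the log; the directory, suffix, and sample names are supplied by hand and are assumptions.

```shell
# Hypothetical helper, not part of GBS-SNP-CROP.
# Prints every expected read file that is missing or empty.
# Usage: missing_inputs DIR SUFFIX SAMPLE...
missing_inputs() {
  mi_dir=$1
  mi_suffix=$2
  shift 2
  for mi_sample in "$@"; do
    for mi_read in R1 R2; do
      mi_file="$mi_sample.$mi_read.$mi_suffix"
      # -s is true only if the file exists and has a size greater than zero
      [ -s "$mi_dir/$mi_file" ] || echo "$mi_file"
    done
  done
}
```

For the tutorial data, `missing_inputs . fq.gz Lib1_01 Lib1_02 Lib1_03 Lib1_04 Lib1_05` run from the working folder would list any `.fq.gz` files that Step 5 tried to open but could not find. An empty result would suggest the cause lies elsewhere.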

Also, I don't understand what the output files named MR.Genome with the extensions .fa.amb, .fa.ann, .fa.bwt, .fa.fai, .fa.pac and .fa.sa are for.
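On the extensions: the log shows Step 5 running `bwa index -a bwtsw MR.Genome.fa`, which writes the `.amb`, `.ann`, `.bwt`, `.pac` and `.sa` files, and the later "Indexing the reference genome FASTA file" step corresponds to the `.fai` file that `samtools faidx` produces. These are index files that let the aligner and samtools look up the reference quickly; they are not results in themselves. As a small sketch, the hypothetical function below (`describe_index_file` is an illustrative name, not a pipeline command) maps each suffix to its usual role.

```shell
# Hypothetical lookup, for illustration only: maps a reference index
# file name to the tool that writes it and what it typically holds.
describe_index_file() {
  case "$1" in
    *.amb) echo "bwa index: positions of ambiguous (non-ACGT) bases" ;;
    *.ann) echo "bwa index: sequence names and lengths" ;;
    *.bwt) echo "bwa index: Burrows-Wheeler transform of the reference" ;;
    *.pac) echo "bwa index: packed reference sequence" ;;
    *.sa)  echo "bwa index: suffix array" ;;
    *.fai) echo "samtools faidx: FASTA index of sequence names, lengths and offsets" ;;
    *)     echo "not a reference index file" ;;
  esac
}
```

For example, `describe_index_file MR.Genome.fa.sa` prints the suffix-array line, matching the "Construct SA from BWT and Occ" message in the log.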

Thanks in advance,

Ángela Parody Merino

halelab commented 4 years ago

Please see the newly released v.4.1 with its updated User Manual, and thanks for flagging the bugs. iago