stuck at 03.kmer_count, seq_split outputs FASTA file instead of FASTQ

Nextomics / NextPolish

Fast and accurately polish the genome generated by long reads.

GNU General Public License v3.0


stuck at 03.kmer_count, seq_split outputs FASTA file instead of FASTQ #75

Closed ctxchris closed 3 years ago

ctxchris commented 3 years ago

Hi,

I'm using NextPolish v1.3.1 with long and short reads for polishing. The short reads are interleaved paired-end libraries in FASTQ format. The pipeline gets stuck in 03.kmer_count. The file 03.kmer_count/03.map.ref.sh.work/map_genome00/nextPolish.sh.e contains the error message: [E::main_mem] fail to open file polishing/./01_rundir/input.sgspart.000.fastq.gz. It is looking for a FASTQ file, while the working directory contains FASTA files: input.sgspart.000.fasta.gz. The input data is in proper FASTQ format. In a previous project, my input FASTQ files were split into FASTQ files: input.sgspart.000.fastq.gz

Thanks, Chris

moold commented 3 years ago

Hi, could you paste the content of your config file here?

ctxchris commented 3 years ago

[General]
job_type = local
job_prefix = nextPolish
task = best
rewrite = yes
rerun = 3
parallel_jobs = 10
multithread_jobs = 10
genome = ./assembly.fasta
genome_size = auto
workdir = ./01_rundir
polish_options = -p {multithread_jobs}

[sgs_option]
sgs_fofn = ./sgs.fofn
sgs_options = -max_depth 100 -bwa

[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 5k -max_depth 80
lgs_minimap2_options = -x map-pb -t {multithread_jobs}

moold commented 3 years ago

It seems everything is OK, so please check that the files listed in sgs.fofn and lgs.fofn are correct, and try again.
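A quick way to do that check is to walk through the fofn and confirm that each listed file exists and looks like FASTQ. The following is only a minimal sketch, not part of NextPolish: `sniff_format` and `check_fofn` are hypothetical helper names, paths are used exactly as written in the fofn (so relative paths resolve against the current directory), and the format guess is a heuristic based on the first character of the file (`@` for FASTQ, `>` for FASTA).

```python
import gzip
import os


def sniff_format(path):
    """Guess the format from the first character of a (possibly gzipped) file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        first = handle.read(1)
    if first == "@":
        return "fastq"
    if first == ">":
        return "fasta"
    return "unknown"


def check_fofn(fofn_path):
    """Return (path, status) for every non-empty line in a file-of-filenames.

    status is "missing" if the file does not exist, otherwise the guessed format.
    """
    report = []
    with open(fofn_path) as fofn:
        for line in fofn:
            entry = line.strip()
            if not entry:
                continue  # skip blank lines
            if not os.path.isfile(entry):
                report.append((entry, "missing"))
            else:
                report.append((entry, sniff_format(entry)))
    return report
```

Calling `check_fofn("sgs.fofn")` and printing the result shows at a glance whether any entry is missing or is not FASTQ. It does not detect whether a FASTQ file is interleaved.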

ctxchris commented 3 years ago

I ran the same configuration again, but instead of using one FASTQ file with forward and reverse reads interleaved, I used separate files. sgs.fofn, 1st run:

interleaved.fastq

sgs.fofn, 2nd run:

forward.fastq
reverse.fastq

The second run worked, and the input.sgspart.000.fastq.gz files were created.

moold commented 3 years ago

Yes, the input paired-end files need to be listed as separate forward and reverse files, like this: ls reads1_R1.fq reads1_R2.fq reads2_R1.fq.gz reads2_R2.fq.gz ... > sgs.fofn
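For anyone starting from a single interleaved file, as in the first run above, one option is to split it into forward and reverse files before building sgs.fofn. The sketch below is an illustration under stated assumptions, not something shipped with NextPolish: `deinterleave` and `write_fofn` are hypothetical helper names, each FASTQ record is assumed to span exactly four lines, and reads are assumed to alternate strictly (forward, reverse, forward, reverse, ...) with no trailing blank lines.

```python
import gzip
from itertools import islice


def open_text(path, mode):
    """Open a plain or gzipped file in text mode ("r" or "w")."""
    opener = gzip.open if path.endswith(".gz") else open
    return opener(path, mode + "t")


def deinterleave(interleaved, forward, reverse):
    """Split an interleaved FASTQ into two files and return the number of pairs.

    Assumes 4-line records that alternate forward, reverse.
    """
    pairs = 0
    with open_text(interleaved, "r") as src, \
            open_text(forward, "w") as r1, \
            open_text(reverse, "w") as r2:
        while True:
            rec1 = list(islice(src, 4))  # next forward record
            if not rec1:
                break  # clean end of file
            rec2 = list(islice(src, 4))  # matching reverse record
            if len(rec1) != 4 or len(rec2) != 4:
                raise ValueError("truncated record or unpaired read")
            r1.writelines(rec1)
            r2.writelines(rec2)
            pairs += 1
    return pairs


def write_fofn(paths, fofn_path):
    """Write one path per line, the same layout that `ls ... > sgs.fofn` gives."""
    with open(fofn_path, "w") as fofn:
        for path in paths:
            fofn.write(path + "\n")
```

With the file names from this thread, that would be `deinterleave("interleaved.fastq", "forward.fastq", "reverse.fastq")` followed by `write_fofn(["forward.fastq", "reverse.fastq"], "sgs.fofn")`.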