Nextomics / NextPolish

Fast and accurately polish the genome generated by long reads.
GNU General Public License v3.0

Unable to run nextpolish program #88

Open amit4mchiba opened 2 years ago

amit4mchiba commented 2 years ago

Hi, I am writing here to request your help running NextPolish. I have long reads (PacBio Sequel I) and Illumina reads, and I want to polish the assembly. After installing the software, I was able to run the test, which means the installation has no problems.

Next, based on the test, I created a config file and ran the program, and got the following error:

[154451 INFO] 2022-03-02 02:47:30 Submitted jobID:[154452] jobCmd:[/mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir3/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh] in the local_cycle.
[154452 CRITICAL] 2022-03-02 02:47:30 Command '/bin/sh /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir3/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh > /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir3/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o 2> /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir3/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e' returned non-zero exit status 1, error info: .
Traceback (most recent call last):
  File "/mnt/HD1/NextPolish/nextPolish", line 515, in main(args)
  File "/mnt/HD1/NextPolish/nextPolish", line 369, in main task.run.start()
  File "/home/amit8chiba/miniconda2/lib/python2.7/site-packages/paralleltask/task_control.py", line 347, in start self._start()
  File "/home/amit8chiba/miniconda2/lib/python2.7/site-packages/paralleltask/task_control.py", line 371, in _start self.submit(job)
  File "/home/amit8chiba/miniconda2/lib/python2.7/site-packages/paralleltask/task_control.py", line 255, in submit _, stdout, _ = self.run(job.cmd)
  File "/home/amit8chiba/miniconda2/lib/python2.7/site-packages/paralleltask/task_control.py", line 291, in run log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
  File "/home/amit8chiba/miniconda2/lib/python2.7/logging/__init__.py", line 1212, in critical self._log(CRITICAL, msg, args, **kwargs)
  File "/home/amit8chiba/miniconda2/lib/python2.7/logging/__init__.py", line 1286, in _log self.handle(record)
  File "/home/amit8chiba/miniconda2/lib/python2.7/logging/__init__.py", line 1296, in handle self.callHandlers(record)
  File "/home/amit8chiba/miniconda2/lib/python2.7/logging/__init__.py", line 1336, in callHandlers hdlr.handle(record)
  File "/home/amit8chiba/miniconda2/lib/python2.7/logging/__init__.py", line 759, in handle self.emit(record)
  File "/home/amit8chiba/miniconda2/lib/python2.7/site-packages/paralleltask/kit.py", line 42, in emit raise Exception(record.msg)
Exception: Command '/bin/sh /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir3/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh > /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir3/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o 2> /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir3/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e' returned non-zero exit status 1, error info: .
[154451 INFO] 2022-03-02 02:47:30 Submitted jobID:[154458] jobCmd:[/mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir3/00.lgs_polish/01.db_split.sh.work/db_split2/nextPolish.sh] in the local_cycle.
[154451 INFO] 2022-03-02 02:47:31 Submitted jobID:[154474] jobCmd:[/mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir3/00.lgs_polish/01.db_split.sh.work/db_split3/nextPolish.sh] in the local_cycle.

What does this mean, and how can I solve the error? I would be grateful for your advice. Please find the requested information below, and let me know if anything further is needed.

Thank you so much in advance, Amit

Error message

[153452 INFO] 2022-03-02 02:06:54 NextPolish start...
[153452 INFO] 2022-03-02 02:06:54 version:v1.4.0 logfile:pid153452.log.info
[153452 WARNING] 2022-03-02 02:06:54 Re-write workdir
[153452 WARNING] 2022-03-02 02:06:55 Delete task: 6 due to missing hifi_fofn.
[153452 WARNING] 2022-03-02 02:06:55 Delete task: 6 due to missing hifi_fofn.
[153452 INFO] 2022-03-02 02:06:55 scheduled tasks: [5, 5, 1, 2, 1, 2]
[153452 INFO] 2022-03-02 02:06:55 options:
[153452 INFO] 2022-03-02 02:06:55 rerun: 3 kill: None rewrite: 1 cleantmp: 0 use_drmaa: 0 submit: None job_type: local sgs_unpaired: 0 sgs_rm_nread: 1 parallel_jobs: 12 align_threads: 48 check_alive: None job_id_regex: None lgs_max_depth: 100 lgs_read_type: clr sgs_max_depth: 100 lgs_max_read_len: 0 hifi_max_depth: 100 multithread_jobs: 10 lgs_min_read_len: 1k hifi_max_read_len: 0 hifi_block_size: 500M polish_options: -p 10 hifi_min_read_len: 1k job_prefix: nextPolish genome_size: 1802711686 task: [5, 5, 1, 2, 1, 2] lgs_block_size: 500000000 sgs_block_size: 500000000 sgs_use_duplicate_reads: 0 hifi_minimap2_options: -x map-pb sgs_align_options: bwa mem -p -t 10 lgs_minimap2_options: -x map-pb -t 48 sgs_fofn: /mnt/md1/Gg_GI_polishing/Sf_polishing/./sgs.fofn lgs_fofn: /mnt/md1/Gg_GI_polishing/Sf_polishing/./lgs.fofn workdir: /mnt/md1/Gg_GI_polishing/Sf_polishing/./Sf_Nextpolish_rundir genome: /mnt/md1/Gg_GI_polishing/Sf_polishing/./Sf_tgs_Gapfilled_assembly.fasta snp_valid: /mnt/md1/Gg_GI_polishing/Sf_polishing/./Sf_Nextpolish_rundir/%02d.snp_valid snp_phase: /mnt/md1/Gg_GI_polishing/Sf_polishing/./Sf_Nextpolish_rundir/%02d.snp_phase lgs_polish: /mnt/md1/Gg_GI_polishing/Sf_polishing/./Sf_Nextpolish_rundir/%02d.lgs_polish kmer_count: /mnt/md1/Gg_GI_polishing/Sf_polishing/./Sf_Nextpolish_rundir/%02d.kmer_count hifi_polish: /mnt/md1/Gg_GI_polishing/Sf_polishing/./Sf_Nextpolish_rundir/%02d.hifi_polish score_chain: /mnt/md1/Gg_GI_polishing/Sf_polishing/./Sf_Nextpolish_rundir/%02d.score_chain
[153452 INFO] 2022-03-02 02:06:55 step 0 and task 5 start:
[153452 INFO] 2022-03-02 02:07:00 Total jobs: 3
[153452 INFO] 2022-03-02 02:07:00 Submitted jobID:[153453] jobCmd:[/mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh] in the local_cycle.
[153453 CRITICAL] 2022-03-02 02:07:00 Command '/bin/sh /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh > /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o 2> /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e' returned non-zero exit status 1, error info: .
[153452 INFO] 2022-03-02 02:07:01 Submitted jobID:[153459] jobCmd:[/mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir/00.lgs_polish/01.db_split.sh.work/db_split2/nextPolish.sh] in the local_cycle.
[153452 INFO] 2022-03-02 02:07:01 Submitted jobID:[153475] jobCmd:[/mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir/00.lgs_polish/01.db_split.sh.work/db_split3/nextPolish.sh] in the local_cycle.
[153452 ERROR] 2022-03-02 02:45:28 db_split failed: please check the following logs:
[153452 ERROR] 2022-03-02 02:45:28 /mnt/md1/Gg_GI_polishing/Sf_polishing/Sf_Nextpolish_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e

nextPolish.sh.e

hostname

Operating system
Which operating system and version are you using?

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal

GCC
What version of GCC are you using?

COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 6.4.0-17ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr --with-as=/usr/bin/x86_64-linux-gnu-as --with-ld=/usr/bin/x86_64-linux-gnu-ld --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.4.0 20180424 (Ubuntu 6.4.0-17ubuntu1)

Python What version of Python are you using? Python 2.7.14 (Installed using conda)

NextPolish What version of NextPolish are you using? nextPolish v1.4.0

moold commented 2 years ago

As the log says, /mnt/md1/Gg_GI_polishing/Sf_polishing/Soplha_reverse_paired.fastq.g does not exist! It seems you have a typo: the filename should end with .gz, not .g.
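As an aside, a missing or mistyped path in a fofn is easy to catch up front. The sketch below (not part of NextPolish; the file name `sgs.fofn` matches the one in the config above, adjust as needed) checks that every path listed in a fofn exists before a run is started:

```python
# Sanity check for a NextPolish file-of-file-names: report any listed
# read file that does not exist on disk before launching a run.
import os

def check_fofn(fofn_path):
    """Return the list of paths in the fofn that do not exist."""
    with open(fofn_path) as fh:
        paths = [line.strip() for line in fh if line.strip()]
    missing = [p for p in paths if not os.path.exists(p)]
    for p in missing:
        print("MISSING:", p)
    return missing
```

Running something like `check_fofn("sgs.fofn")` before `nextPolish run.cfg` would flag a `.fastq.g`-style typo immediately instead of deep inside db_split.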

amit4mchiba commented 2 years ago

Thank you so much for the reply.

I checked the files, and all of them were present, so I still do not know what caused the error. One thing I did differently: after trimming the Illumina reads, I concatenated the forward and reverse reads into a single forward and a single reverse dataset, and then used those to run the program. I suspect this dataset was too big (or bigger than the program expects), causing the error.

I re-ran the program without concatenating the fastq files, supplying them through fofn files instead, and so far it is running without any error. I will keep you posted if any new error emerges.

I have another question. I was testing the difference between polishing my assembly with only short reads versus only long reads. Of course, the long-read run finished quickly, but it also filled in all the gaps (from over 1789 assembly gaps down to 0). Clearly, this is a mistake. I checked your manual and instructions, and they recommend splitting the assembly at the gaps, polishing with NextPolish, and then scaffolding again. However, scaffolding takes a lot of resources and information. Would it be possible to create a BED file recording the gap positions, split the assembly, run NextPolish, and at the end restore the contigs to their original arrangement using the BED file of the original gaps? Could this be implemented here, or could you suggest a program to do it?

Thank you so much in advance,

with best regards Amit

moold commented 2 years ago

Hi, by default NextPolish requires paired-end reads to be in separate files; you can switch to single-file input with the option -unpaired. As for "Is it possible to create a bed file with information related to gaps ... ?": yes, you can do it like that, but you will probably have to write a small script for this task.
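The first half of such a script could look roughly like the sketch below (an illustration, not an official NextPolish tool; file names, the `min_gap` threshold, and the `_part` naming scheme are all placeholders). It splits each scaffold at runs of N and writes the gap coordinates to a BED file, so that after polishing the pieces can be rejoined in their original order by reading the BED back:

```python
# Split scaffolds at gaps (runs of N) and record each gap in a BED file
# (0-based, half-open coordinates), so polished pieces can be rejoined later.
import re

def split_at_gaps(fasta_path, split_path, bed_path, min_gap=10):
    # Parse the FASTA into (name, sequence) records.
    records, name, chunks = [], None, []
    with open(fasta_path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:].split()[0], []
            else:
                chunks.append(line)
        if name is not None:
            records.append((name, "".join(chunks)))
    # Write each inter-gap segment as its own contig, and each gap to the BED.
    with open(split_path, "w") as out, open(bed_path, "w") as bed:
        for name, seq in records:
            prev_end, part = 0, 0
            for m in re.finditer("N{%d,}" % min_gap, seq, re.I):
                bed.write(f"{name}\t{m.start()}\t{m.end()}\n")
                out.write(f">{name}_part{part} {prev_end}-{m.start()}\n{seq[prev_end:m.start()]}\n")
                prev_end, part = m.end(), part + 1
            out.write(f">{name}_part{part} {prev_end}-{len(seq)}\n{seq[prev_end:]}\n")
```

After polishing the split FASTA, a companion step would walk the BED file and concatenate the polished parts with the recorded number of Ns between them; note the polished segment lengths may shift slightly, so the rejoin should go by part order rather than original coordinates.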