Nextomics / NextPolish

Fast and accurate polishing of genomes generated by long reads.
GNU General Public License v3.0

No seq_split in new download #91

Closed: Amarea1 closed this issue 2 years ago

Amarea1 commented 2 years ago

Question or Expected behavior:
I have attempted to update my copy of NextPolish from 1.3 to the newest version. After downloading the file from GitHub, the test data ran fine, but when I run my own data I get an error saying that there is no seq_split file. Re-downloading does not fix the issue. It also appears to generate a backup folder containing only the "00.score_chain" directory as soon as it gives me the traceback.

Operating system: Ubuntu 18.04 server

GCC: v. 7.5.0

Python: 3.8.12

NextPolish: 1.4.0

/home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e hostname

I'm not the most proficient person with Linux or software like this so hopefully it's an easy issue to fix. Thanks very much!

moold commented 2 years ago

Hi, you need to compile after downloading, then you will find seq_split in the bin directory.

pip install paralleltask
tar -vxzf NextPolish.tgz && cd NextPolish && make
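
A quick way to confirm that the build step produced the missing file, as a minimal sketch: `NP_DIR` and `check_tool` are hypothetical names introduced here for illustration and are not part of NextPolish; the default path assumes the archive was unpacked in the current directory.

```shell
# NP_DIR is an assumed variable name for the unpacked NextPolish directory.
NP_DIR="${NP_DIR:-./NextPolish}"

# Hypothetical helper: succeeds only if the path exists and is executable.
check_tool() {
    if [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or not executable: $1"
        return 1
    fi
}

# seq_split is the file named in the original error; other tools in bin
# can be checked the same way.
check_tool "$NP_DIR/bin/seq_split" || echo "run 'make' inside $NP_DIR and check again"
```

If the check reports the file as missing, the `make` step probably did not complete, and its output is the first place to look.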

Amarea1 commented 2 years ago

Hi again

I tried reinstalling from scratch and made sure to run both of the commands you gave, but now I'm getting a different error.

Output:

[20784 INFO] 2022-04-27 12:21:13 NextPolish start...
[20784 INFO] 2022-04-27 12:21:13 version:v1.4.0 logfile:pid20784.log.info
[20784 WARNING] 2022-04-27 12:21:13 Delete task: 5 due to missing lgs_fofn.
[20784 WARNING] 2022-04-27 12:21:13 Delete task: 5 due to missing lgs_fofn.
[20784 WARNING] 2022-04-27 12:21:13 Delete task: 6 due to missing hifi_fofn.
[20784 WARNING] 2022-04-27 12:21:13 Delete task: 6 due to missing hifi_fofn.
[20784 INFO] 2022-04-27 12:21:13 scheduled tasks: [1, 2, 1, 2]
[20784 INFO] 2022-04-27 12:21:13 options:
[20784 INFO] 2022-04-27 12:21:13
rerun: 3
rewrite: 0
kill: None
cleantmp: 0
use_drmaa: 0
submit: None
job_type: local
sgs_unpaired: 0
sgs_rm_nread: 1
lgs_read_type:
parallel_jobs: 6
align_threads: 5
check_alive: None
task: [1, 2, 1, 2]
job_id_regex: None
sgs_max_depth: 100
lgs_max_depth: 100
multithread_jobs: 5
genome_size: 415305
lgs_max_read_len: 0
hifi_max_depth: 100
lgs_block_size: 500M
lgs_min_read_len: 1k
hifi_max_read_len: 0
polish_options: -p 5
hifi_block_size: 500M
hifi_min_read_len: 1k
job_prefix: nextPolish
sgs_block_size: 6921750.0
sgs_use_duplicate_reads: 0
lgs_minimap2_options: -x map-ont
hifi_minimap2_options: -x map-pb
workdir: /home/stxsk44/NextPolish
sgs_align_options: bwa mem -p -t 5
sgs_fofn: /home/stxsk44/NextPolish/sgs.fofn
snp_phase: /home/stxsk44/NextPolish/%02d.snp_phase
snp_valid: /home/stxsk44/NextPolish/%02d.snp_valid
genome: /home/stxsk44/NextPolish/L14W.contigs.fasta
lgs_polish: /home/stxsk44/NextPolish/%02d.lgs_polish
kmer_count: /home/stxsk44/NextPolish/%02d.kmer_count
hifi_polish: /home/stxsk44/NextPolish/%02d.hifi_polish
score_chain: /home/stxsk44/NextPolish/%02d.score_chain
[20784 WARNING] 2022-04-27 12:21:13 mv /home/stxsk44/NextPolish to /home/stxsk44/NextPolish.backup1
[20784 INFO] 2022-04-27 12:21:13 step 0 and task 1 start:
[20784 INFO] 2022-04-27 12:21:18 Total jobs: 3
[20784 INFO] 2022-04-27 12:21:18 Submitted jobID:[20793] jobCmd:[/home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh] in the local_cycle.
[20793 CRITICAL] 2022-04-27 12:21:18 Command '/bin/sh /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh > /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.o 2> /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e' returned non-zero exit status 127, error info: .
Traceback (most recent call last):
  File "./nextPolish", line 515, in <module>
    main(args)
  File "./nextPolish", line 369, in main
    task.run.start()
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 344, in start
    self._start()
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 368, in _start
    self.submit(job)
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 252, in submit
    _, stdout, _ = self.run(job.cmd)
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 288, in run
    log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1493, in critical
    self._log(CRITICAL, msg, args, **kwargs)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1589, in _log
    self.handle(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1599, in handle
    self.callHandlers(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1661, in callHandlers
    hdlr.handle(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 954, in handle
    self.emit(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/kit.py", line 42, in emit
    raise Exception(record.msg)
Exception: Command '/bin/sh /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh > /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.o 2> /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e' returned non-zero exit status 127, error info: .
[20784 INFO] 2022-04-27 12:21:19 Submitted jobID:[20803] jobCmd:[/home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh] in the local_cycle.
[20803 CRITICAL] 2022-04-27 12:21:19 Command '/bin/sh /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh > /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.o 2> /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e' returned non-zero exit status 127, error info: .
Traceback (most recent call last):
  File "./nextPolish", line 515, in <module>
    main(args)
  File "./nextPolish", line 369, in main
    task.run.start()
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 344, in start
    self._start()
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 368, in _start
    self.submit(job)
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 252, in submit
    _, stdout, _ = self.run(job.cmd)
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 288, in run
    log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1493, in critical
    self._log(CRITICAL, msg, args, **kwargs)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1589, in _log
    self.handle(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1599, in handle
    self.callHandlers(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1661, in callHandlers
    hdlr.handle(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 954, in handle
    self.emit(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/kit.py", line 42, in emit
    raise Exception(record.msg)
Exception: Command '/bin/sh /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh > /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.o 2> /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e' returned non-zero exit status 127, error info: .
[20784 INFO] 2022-04-27 12:21:19 Submitted jobID:[20809] jobCmd:[/home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh] in the local_cycle.
[20809 CRITICAL] 2022-04-27 12:21:19 Command '/bin/sh /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh > /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.o 2> /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.e' returned non-zero exit status 127, error info: .
Traceback (most recent call last):
  File "./nextPolish", line 515, in <module>
    main(args)
  File "./nextPolish", line 369, in main
    task.run.start()
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 344, in start
    self._start()
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 368, in _start
    self.submit(job)
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 252, in submit
    _, stdout, _ = self.run(job.cmd)
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/task_control.py", line 288, in run
    log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1493, in critical
    self._log(CRITICAL, msg, args, **kwargs)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1589, in _log
    self.handle(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1599, in handle
    self.callHandlers(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 1661, in callHandlers
    hdlr.handle(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/logging/__init__.py", line 954, in handle
    self.emit(record)
  File "/home/stxsk44/miniconda3/lib/python3.8/site-packages/paralleltask/kit.py", line 42, in emit
    raise Exception(record.msg)
Exception: Command '/bin/sh /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh > /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.o 2> /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.e' returned non-zero exit status 127, error info: .
[20784 ERROR] 2022-04-27 12:21:26 db_split failed: please check the following logs:
[20784 ERROR] 2022-04-27 12:21:26 /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split1/nextPolish.sh.e
[20784 ERROR] 2022-04-27 12:21:26 /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split2/nextPolish.sh.e
[20784 ERROR] 2022-04-27 12:21:26 /home/stxsk44/NextPolish/00.score_chain/01.db_split.sh.work/db_split3/nextPolish.sh.e

Error file:

hostname

I've checked and samtools is listed in the bin folder of the backup folder that was created. Any thoughts? Thanks very much.

moold commented 2 years ago

First, if you are sure that the executables in the bin directory, such as samtools, run normally, delete the working directory /home/stxsk44/NextPolish and try running NextPolish again. If you still get an error, the easiest way is to follow here to use your own alignment pipeline.
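
For context on the log above: exit status 127 is what a POSIX shell returns when it cannot find a command, which is consistent with a job script calling an executable that is not where it expects. A minimal sketch that reproduces the status with a deliberately nonexistent command; `no_such_command_for_demo` is a made-up name used only for this demonstration.

```shell
# Run a made-up command in a child shell and capture its exit status.
# The "|| status=$?" form keeps the snippet from aborting under "set -e".
status=0
sh -c 'no_such_command_for_demo' 2>/dev/null || status=$?
echo "exit status: $status"   # prints "exit status: 127"
```

In this thread the job's stderr file contained only "hostname", so checking that the tools in bin exist and are executable is probably the more direct test.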

Amarea1 commented 2 years ago

Just to update, it looks like my issue may have been the same as this one:

https://github.com/Nextomics/NextPolish/issues/93

It was resolved by adding "rewrite = no" and a specific work directory to the run.cfg file. It seems to have run fine with those two lines added.
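
The two settings described above would look roughly like this in run.cfg. This is a sketch, assuming both keys belong in the [General] section as in the NextPolish example configuration; the workdir value is a placeholder and not the path Amarea1 actually used.

```ini
[General]
rewrite = no
# Placeholder path (assumption): a dedicated run directory, separate from the NextPolish install directory.
workdir = ./01_rundir
```

In the log earlier in the thread, workdir was /home/stxsk44/NextPolish, and NextPolish moved that directory to /home/stxsk44/NextPolish.backup1 before starting. Pointing workdir at a dedicated directory may therefore be the part of the fix that matters.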

moold commented 2 years ago

OK