Nextomics / NextDenovo

Fast and accurate de novo assembler for long reads
GNU General Public License v3.0

get_cns error #7

Closed JHAO12321 closed 5 years ago

JHAO12321 commented 5 years ago

Hi, I got errors when running the get_cns step; they are listed below. A segmentation fault also occurred at the sort_align stage, but it disappeared when I reran the failed tasks. Rerunning the failed jobs at the get_cns stage, however, did not help. Could you help me with these problems? Thank you!

ERROR 1: /home/SystemSoftware/tsce/torque6/share/nodes1/mom_priv/jobs/1270566.mu01.SC: line 5: 642 Segmentation fault (core dumped) python /1.Software/NextDenovo/lib/nextCorrector.py -f /nextdenovo/01.correct/.//02.cns_align//01.get_cns.input.idxs -i /nextdenovo/01.correct/01.raw_align/03.sort_align.sh.work/sort_align006/input.seed.088.sorted.ovl -p 20 -max_lq_length 1000 -fast -o cns.fasta

ERROR 2: Traceback (most recent call last): File "/1.Software/NextDenovo/lib/nextCorrector.py", line 258, in main(args) File "/1.Software/NextDenovo/lib/nextCorrector.py", line 198, in main worker, read_seq_data(args, corrected_seeds), chunksize=1): File "/1.Software/python_lib/nextdenovo/lib/python2.7/multiprocessing/pool.py", line 668, in next raise value SystemError: NULL result without error in PyObject_Call

ERROR 3: Error in `python': double free or corruption (out): 0x00002b70bbe79010 ======= Backtrace: ========= /lib64/libc.so.6(+0x7cfe1)[0x2b6e4bdb0fe1] /1.Software/NextDenovo/lib/ovlSeq.so(bit2seq+0x1f7)[0x2b6e54d0e867] /1.Software/python_lib/nextdenovo/lib/python2.7/lib-dynload/_ctypes.so(ffi_call_unix64+0x4c)[0x2b6e548fb35c] /1.Software/python_lib/nextdenovo/lib/python2.7/lib-dynload/_ctypes.so(ffi_call+0x1f5)[0x2b6e548faab5] /1.Software/python_lib/nextdenovo/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x3e6)[0x2b6e548f2166] /1.Software/python_lib/nextdenovo/lib/python2.7/lib-dynload/_ctypes.so(+0x9cf3)[0x2b6e548e9cf3] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x2b6e4b071d23] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6a24)[0x2b6e4b122f54] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(+0x6dcec)[0x2b6e4b095cec] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(+0x686cd)[0x2b6e4b0906cd] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x2e1f)[0x2b6e4b11f34f] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(+0x6dcec)[0x2b6e4b095cec] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(+0x686cd)[0x2b6e4b0906cd] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x2e1f)[0x2b6e4b11f34f] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x89e)[0x2b6e4b125a2e] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(+0x794a8)[0x2b6e4b0a14a8] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x2b6e4b071d23] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6267)[0x2b6e4b122797] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8665)[0x2b6e4b124b95] 
/1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x8665)[0x2b6e4b124b95] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x89e)[0x2b6e4b125a2e] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(+0x793a1)[0x2b6e4b0a13a1] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x2b6e4b071d23] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(+0x5c4bf)[0x2b6e4b0844bf] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x53)[0x2b6e4b071d23] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x43)[0x2b6e4b11b633] /1.Software/python_lib/nextdenovo/bin/../lib/libpython2.7.so.1.0(+0x135d42)[0x2b6e4b15dd42] /lib64/libpthread.so.0(+0x7dc5)[0x2b6e4b416dc5] /lib64/libc.so.6(clone+0x6d)[0x2b6e4be2a1cd] ======= Memory map: ======== ...... ...... /home/SystemSoftware/tsce/torque6/share/nodes2/mom_priv/jobs/1270586.mu01.SC: line 5: 25161 Aborted (core dumped) python /1.Software/NextDenovo/lib/nextCorrector.py -f /nextdenovo/01.correct/.//02.cns_align//01.get_cns.input.idxs -i /nextdenovo/01.correct/01.raw_align/03.sort_align.sh.work/sort_align026/input.seed.021.sorted.ovl -p 20 -max_lq_length 1000 -fast -o cns.fasta

moold commented 5 years ago

Could you send the following two files to me by email?

JHAO12321 commented 5 years ago

Hi, thank you for your prompt reply. I suspect that the 'sorted.ovl' files may not have the correct structure because of disk problems, so I am regenerating them. I will rerun the tasks with the new ovl files and send you the files if the errors persist.

   Thank you!

Best Regards


moold commented 5 years ago

OK

JHAO12321 commented 5 years ago

Hi, I have uploaded the two files: https://pan.genomics.cn/ucdisk/s/MJr2ii. I would also like to know the cause of the other two errors. Thank you very much!



moold commented 5 years ago

Hi, I have checked the files that you uploaded, and this appears to be a bug in an old version. I fixed it in the new version some weeks ago, so please download the newest version and try again.

JHAO12321 commented 5 years ago

OK, I will try. Thanks!

Best Regards Jiahao wangjiahao@genomics.cn


JHAO12321 commented 5 years ago

I downloaded NextDenovo from GitHub yesterday and reran the get_cns step, but the error still exists. Maybe it has some other cause. Could you please check again? Thank you very much!

Best Regards Jiahao wangjiahao@genomics.cn


JHAO12321 commented 5 years ago

Hi, if the problem is not easy to solve, could you help me skip reads like this? Thank you.

Best Regards Jiahao wangjiahao@genomics.cn


moold commented 5 years ago

You should re-run the whole pipeline, not only the get_cns step.

moold commented 5 years ago

However, if only a few tasks have this error, you can check the input.fofn for the sort task, for example /nextdenovo/01.correct/01.raw_align/03.sort_align.sh.work/sort_align006/input.fofn, rerun the minimap task that produced the ovl files listed in input.fofn, and then re-run the corresponding sort task and cns task.
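
As a rough illustration of the first step, the Python sketch below reads a fofn and reports entries that are missing or empty on disk, which may help narrow down which minimap outputs to regenerate. It assumes the usual fofn convention of one file path per line. The function name `find_suspect_overlap_files` is hypothetical and not part of NextDenovo, and the check cannot detect a file that is present but internally corrupted.

```python
import os


def find_suspect_overlap_files(fofn_path):
    """Return (path, reason) pairs for fofn entries that look unusable.

    Hypothetical helper, not part of NextDenovo. Assumes one path per line.
    """
    suspects = []
    with open(fofn_path) as handle:
        for line in handle:
            path = line.strip()
            if not path:
                continue  # ignore blank lines in the fofn
            if not os.path.isfile(path):
                suspects.append((path, "missing"))
            elif os.path.getsize(path) == 0:
                suspects.append((path, "empty"))
    return suspects
```

Calling `find_suspect_overlap_files` on the input.fofn of a failed sort task returns an empty list when every listed file exists and is non-empty; in that case the files would need a closer look by other means.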

JHAO12321 commented 5 years ago

It will take a lot of time to check whether the new version is OK. Is NextDenovo suitable for large PacBio datasets? I know that small datasets are fine, but I haven't seen anyone process a large PacBio dataset with NextDenovo.


moold commented 5 years ago

We have used NextDenovo for many large genomes, with genome sizes from 10G to 30G. For large genomes, you can check FAQ IV (use the usetempdir option and adjust -k and -w for minimap2). In fact, for large genomes, if you want to correct the raw reads and then do the assembly, you have almost no other choice; that is why we developed NextDenovo.

moold commented 5 years ago

I think you can re-run one sort task first, check that it works, and then re-run all the other unfinished tasks.

JHAO12321 commented 5 years ago

Yes, I am trying one task now. I hope this will work.

Is this bug specific to PacBio data? I know many people analyze large genomes with NextDenovo, but they all used Nanopore sequencing. They used an older version than mine, and nothing went wrong.

Also, I logged in via ssh to the compute node running one get_cns task. There are 24 subtasks, but almost all of them are in the S state. Although I set "-p 24", the CPU load is only about 1~2. Is there anything wrong with my job? Does this mean that, whatever "-p" is set to, the task actually uses only one CPU when running?
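
One way to quantify this kind of observation is to tally the one-letter process states of the worker processes. The Python sketch below parses the text of Linux `/proc/<pid>/stat` files, where `R` means running, `S` interruptible sleep, and `D` uninterruptible sleep (typically waiting on disk I/O). The function names are hypothetical, and how the stat lines are collected for the relevant PIDs is left to the reader.

```python
from collections import Counter


def state_from_stat_line(stat_line):
    """Extract the one-letter state from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' rather than on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def tally_states(stat_lines):
    """Count how many processes are in each state (hypothetical helper)."""
    return Counter(state_from_stat_line(line) for line in stat_lines)
```

A tally dominated by `S` or `D` with very few `R` entries is consistent with workers waiting for data rather than computing, which matches a CPU load far below the requested `-p` value.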

Best Regards Jiahao wangjiahao@genomics.cn


moold commented 5 years ago

1. No, it is not specific to PacBio data.

2. It means IO wait. Please use the usetempdir option, especially with the temporary directory on an SSD.

moold commented 5 years ago

I will close this issue, but if you still have this problem, feel free to reopen it.