mapleforest / HaploMerger2


Can't find .nib files #18

Open veronneaupy opened 6 years ago

veronneaupy commented 6 years ago

Hi,

I have a problem running the second script, ./hm.batchA2.chainNet_and_netToMaf. The log says: Checking missing sizes files ... checking nib files in dipsaci13avril.seq/ ... no.nib no.nib ...

The .nib files were generated after 24 h of computation (50 threads) from a 235 MB primary assembly (2400 contigs, N50 = 215,000, length = 253 Mb, about 2x too long) and placed in my .seq and x.seq folders.

If I comment out the following section of the HM_axtChainRecipBestNet.pl script, it seems to continue with the other steps:

my $missing_nib = 0;
my $sizeFH;
foreach my $temp (@Species) {
    print "checking nib files in $temp.seq/ ...\n";
    open($sizeFH, "<$temp.sizes") or die "Cant not open directory $temp.sizes! die!\n";
    while (<$sizeFH>) {
        next unless m/([-.\w]+)\t/;
        unless (-f "$temp.seq/$1.nib") {
            $missing_nib=1;
            print "$1.nib\n";
        }
    }
    close $sizeFH;
}
die "Some nib files are missing! Die!\n" if ($missing_nib == 1);

My .nib files in MyProject/(name).seq are named like this: tig00000001_len=2049576_reads=1570_covStat=2968.78_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no.nib

Is it just a problem of finding those .nib files, or is there an incompatibility between the script and my .nib names?

thanks for your help,

PY
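
A note on the "no.nib" lines in the log above: the check quoted in the post extracts each sequence name with the pattern ([-.\w]+)\t, whose character class does not include "=". Assuming the .sizes file lists the full Canu contig name followed by a tab and the length (an assumption, since the file itself is not shown in this thread), the pattern would capture only the text after the last "=", which is "no", so the script would look for no.nib. The sketch below imitates that pattern with grep -oE; the helper name captured_name and the sample lengths are made up for illustration.

```shell
#!/bin/sh
# Imitates the name capture in the quoted Perl check, m/([-.\w]+)\t/,
# using a POSIX bracket expression for [-.\w] (letters, digits, "_", "-", ".").
# captured_name is a hypothetical helper, not part of HaploMerger2.
captured_name() {
    tab=$(printf '\t')
    # grep -o prints only the matched text; the trailing tab is then removed.
    printf '%s\n' "$1" | grep -oE "[-._[:alnum:]]+${tab}" | head -n 1 | tr -d '\t'
}

tab=$(printf '\t')

# A plain scaffold name is captured whole (the length 39000 is a placeholder).
captured_name "scf220164595653${tab}39000"

# A Canu-style name containing "=" is cut down to the text after the last "=".
captured_name "tig00000001_len=2049576_reads=1570_covStat=2968.78_gappedBases=no_class=contig_suggestRepeat=no_suggestCircular=no${tab}2049576"
```

If this is what happens, renaming the sequences to simple identifiers (for example tig00000001) before building the .nib files should let the check find them. Whether later HaploMerger2 steps accept "=" in sequence names is not established in this thread.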

veronneaupy commented 6 years ago

I realize that only 6 of my 2402 files end with suggestCircular=yes.nib. My assembly was done with CANU from 12 PacBio SMRT cells. What does suggestCircular=yes mean? Can I just manually change them to =yes in order to proceed?

francicco commented 6 years ago

I found the same bug using the example data, so I guess the problem is not related to your data.

mapleforest commented 6 years ago

Dear all, possible causes:

  1. HM2 cannot find its executables through the PATH.
  2. The compiled chainNet programs are not compatible with your operating system.

After checking these, if you still have the problem, you can pack the whole output directory (example1) and send it to me. I will analyze it carefully.

Best regards, Shengfeng

mapleforest commented 6 years ago

See this other thread if you encounter the same problem:

https://github.com/mapleforest/HaploMerger2/issues/11#issue-277635624

francicco commented 6 years ago

I have now compiled all third-party software, including all of kentUtils. At the step _A2.axtChainRecipBestNet I still get:

Can't open bbv18wmx.seq/scf220164595653.nib to read: No such file or directory
Can't open bbv18wmx.seq/scf220164595100.nib to read: No such file or directory
...

[fc464@login-e-12 HaploMerger2_TEST]$ ll bbv18wmx.seq/scf220164595653.nib
-rw-rw-r-- 1 fc464 39K Sep 14 14:49 bbv18wmx.seq/scf220164595653.nib

Why? F

mapleforest commented 6 years ago

I need more info.

Assuming examples 1 and 2 run successfully with output similar to the reference output, the problem is most likely that too few file handles can be opened (use ulimit to raise the limit).

Also, if you can pack all the log files and control files in your working directory and send them to me, I can look into it.

francicco commented 6 years ago

Attached are all the log files for these commands:

cd /home/fc464/rds/rds-shm37-helixmbodyw/HaploMerger2_TEST
ulimit -n 655350 
./hm.batchA1.initiation_and_all_lastz bbv18wm
./hm.batchA2.chainNet_and_netToMaf bbv18wm
./hm.batchA3.misjoin_processing bbv18wm

_A2.axtChainRecipBestNet.log slurm-5170686.log _A3.faDnaPolishing.log _A3.misjoin_processing.log _A3.pathFinder_preparation.log _A3.pathFinder.log _A1.all_lastz.log

mapleforest commented 6 years ago

It seems there are several problems in your commands.

1) Some log files are not included, such as _A1.initiation.log. You may compare your log files with the correct ones in *\HaploMerger2_20180603\project_example3\test3_correct_output

2) It appears HM2 did not work from the very beginning: it cannot find its files, most likely because you ran your project in the wrong directory.

3) A project should be run in a project directory under the root directory of HM2; this is required to keep the files and results well organized. For example:

cd */HaploMerger2_xxx/my_test_project1/
sh ./hm.batchA1.initiation_and_all_lastz bbv18wm

Make sure all the required data, control and batch files are in this project directory.

4) According to slurm-5170686.log, the ulimit command did not take effect. You need administrator privileges to raise the limit; contact your administrator and ask them to lift it for you temporarily. However, examples 1 and 2 (and probably 3, depending on the operating system) do not require ulimit, so you can test HM2 with them without contacting your administrator.

5) You may start with examples 1 and 2 by following the tutorial in the manual. Make sure all batch files, scripts and executables can be found (and executed) through the PATH environment variable. Reading through the manual will help you avoid most of the pitfalls, if not all.
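
Regarding point 4, a quick way to see what the shell is actually allowed to do is to print the soft and hard limits for open files. In bash and most other shells, an unprivileged user can raise the soft limit only up to the hard limit; raising the hard limit itself needs an administrator. A minimal sketch follows; the helper name can_raise_to is made up, and 655350 is the value from the commands posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a hard limit ($1) permits a target ($2).
# $1 is either a number or the word "unlimited", as printed by `ulimit -Hn`.
can_raise_to() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

soft=$(ulimit -Sn)   # limit currently in effect for this shell
hard=$(ulimit -Hn)   # ceiling an unprivileged user cannot exceed
echo "soft limit: $soft, hard limit: $hard"

target=655350
if can_raise_to "$hard" "$target"; then
    echo "ulimit -n $target should succeed in this shell"
else
    echo "ulimit -n $target will fail; ask an administrator to raise the hard limit"
fi
```

Run it in the same environment as the batch scripts (for example inside the job script), since the limits of a login shell and of a scheduled job can differ.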

I am looking forward to your feedback.

francicco commented 6 years ago

Sorry, that was my mistake: I was skipping commands to try to isolate the errors. Here are all the log files again. And yes, ulimit is not working; I'm asking for permission.

_A3.pathFinder.log _A1.all_lastz.log _A2.axtChainRecipBestNet.log slurm-5218890.log _A1.initiation.log _A3.faDnaPolishing.log _A3.misjoin_processing.log _A3.pathFinder_preparation.log

francicco commented 6 years ago

They gave me a limit of 65536 F

francicco commented 6 years ago

Works!!!! Thanks

mapleforest commented 6 years ago

What a delight! Be sure to set a limit above 500000 if you have tens of thousands of sequences, because 65536 may not be sufficient for that many files. Generally, the number of file handles needed is about 10x the number of sequences (a requirement of the chainNet programs).
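
The rule of thumb above (about 10 file handles per sequence) can be checked before launching batchA2. A minimal sketch, assuming the .sizes file has one line per sequence; the helper names needed_handles and check_limit and the path in the usage line are illustrative, not part of HaploMerger2.

```shell
#!/bin/sh
# Hypothetical helper: prints 10x the number of lines (sequences) in a .sizes file.
needed_handles() {
    n=$(wc -l < "$1" | tr -d ' ')
    echo $((n * 10))
}

# Hypothetical helper: compares that estimate with the limit in effect for this shell.
check_limit() {
    need=$(needed_handles "$1")
    have=$(ulimit -n)
    if [ "$have" = "unlimited" ] || [ "$have" -ge "$need" ]; then
        echo "ok: limit $have covers about $need handles"
    else
        echo "too low: limit $have, about $need handles needed"
    fi
}

# Usage (the path is an example): check_limit bbv18wm.sizes
```

For the 2402-contig assembly mentioned at the top of this thread, the rule gives about 24,020 handles, which fits under 65536.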

francicco commented 6 years ago

Thanks a lot!!! F