jts / nanocorrect

Experimental pipeline for correcting nanopore reads
MIT License

Error 1 when running nanocorrect-overlap.make #14

Open BoMatt opened 9 years ago

BoMatt commented 9 years ago

Hi! When I run nanocorrect-overlap.make I get this error:

make -f nanocorrect-overlap.make INPUT=~/MinION/sspace_butyricum/all_reads_buty.fasta NAME=cor

nanocorrect-preprocess.pl /home/mattia/MinION/sspace_butyricum/all_reads_buty.fasta > cor.pp.fasta
fasta2DB cor cor.pp.fasta
File cor.pp.fasta, Line 1502272: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 1537720: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 1845517: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 1998480: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 2004327: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 2270391: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 2357344: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 2433997: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 2648752: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 2998765: Warning: Read is longer than 2^16-1. Read truncated.
File cor.pp.fasta, Line 3020306: Warning: Read is longer than 2^16-1. Read truncated.
DBsplit -s50 cor
DBdust cor
HPCdaligner -t5 -mdust cor > HPCcommands.txt
/bin/bash HPCcommands.txt
HPCcommands.txt: line 2: 22420 Segmentation fault (core dumped) daligner -t5 -mdust cor.1 cor.1
HPCcommands.txt: line 3: 22422 Segmentation fault (core dumped) daligner -t5 -mdust cor.2 cor.1 cor.2
daligner: Block cor.3 contains reads < 14bp long ! Run DBsplit.
daligner: Block cor.4 contains reads < 14bp long ! Run DBsplit.
LAsort: Cannot open ./cor.1.cor.1.C0.las for 'r'
LAsort: Cannot open ./cor.1.cor.2.C0.las for 'r'
LAsort: Cannot open ./cor.1.cor.3.C0.las for 'r'
LAsort: Cannot open ./cor.1.cor.4.C0.las for 'r'
LAsort: Cannot open ./cor.2.cor.1.C0.las for 'r'
LAsort: Cannot open ./cor.2.cor.2.C0.las for 'r'
LAsort: Cannot open ./cor.2.cor.3.C0.las for 'r'
LAsort: Cannot open ./cor.2.cor.4.C0.las for 'r'
LAsort: Cannot open ./cor.3.cor.1.C0.las for 'r'
LAsort: Cannot open ./cor.3.cor.2.C0.las for 'r'
LAsort: Cannot open ./cor.3.cor.3.C0.las for 'r'
LAsort: Cannot open ./cor.3.cor.4.C0.las for 'r'
LAsort: Cannot open ./cor.4.cor.1.C0.las for 'r'
LAsort: Cannot open ./cor.4.cor.2.C0.las for 'r'
LAsort: Cannot open ./cor.4.cor.3.C0.las for 'r'
LAsort: Cannot open ./cor.4.cor.4.C0.las for 'r'
LAmerge: Cannot open ./L1.1.1.las for 'r'
LAmerge: Cannot open ./L1.2.1.las for 'r'
LAmerge: Cannot open ./L1.3.1.las for 'r'
LAmerge: Cannot open ./L1.4.1.las for 'r'
nanocorrect-overlap.make:37: recipe for target 'cor.las' failed
make: *** [cor.las] Error 1

All dependencies are updated to the latest version. What could be the problem?

jts commented 9 years ago

The warnings about reads that are too long and too short (<14bp) might indicate the cause. Perhaps you can pre-filter your data to get rid of these reads and see if it works?
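A minimal sketch of such a pre-filter, in case it is useful. The two thresholds are taken from the warnings in the log above (reads longer than 2^16-1 are truncated, and daligner rejects blocks with reads shorter than 14bp); the function names are hypothetical and not part of nanocorrect, and whether this filtering alone is enough is an assumption.

```python
# Hypothetical helpers, not part of nanocorrect.
# Thresholds mirror the warnings printed by fasta2DB and daligner above.
MIN_LEN = 14          # daligner: "contains reads < 14bp long"
MAX_LEN = 2**16 - 1   # fasta2DB: "Read is longer than 2^16-1"


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len=MIN_LEN, max_len=MAX_LEN):
    """Keep only records whose sequence length lies in [min_len, max_len]."""
    for header, seq in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq


def write_fasta(records, out):
    """Write (header, sequence) pairs to a file-like object, one line each."""
    for header, seq in records:
        out.write(f"{header}\n{seq}\n")
```

To use it, open the input and output files and call write_fasta(filter_by_length(read_fasta(src)), dst), then pass the filtered file as INPUT= to the makefile.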

Is your input data 2D nanopore reads?

BoMatt commented 9 years ago

Yes, the input is 2D nanopore reads. I will try to filter out the short reads and see if it works.

BoMatt commented 9 years ago

There weren't any reads shorter than 100bp in the FASTA. I also tried filtering out reads shorter than 500bp with two different tools, but I'm getting the same error.

make -f nanocorrect-overlap.make INPUT=~/MinION/sspace_butyricum/all2Dbutyricum_filter_galaxy.fasta NAME=gala

nanocorrect-preprocess.pl /home/mattia/MinION/sspace_butyricum/all2Dbutyricum_filter_galaxy.fasta > gala.pp.fasta
fasta2DB gala gala.pp.fasta
DBsplit -s50 gala
DBdust gala
HPCdaligner -t5 -mdust gala > HPCcommands.txt
/bin/bash HPCcommands.txt
daligner: Block gala.1 contains reads < 14bp long ! Run DBsplit.
LAsort: Cannot open ./gala.1.gala.1.C0.las for 'r'
nanocorrect-overlap.make:37: recipe for target 'gala.las' failed
make: *** [gala.las] Error 1

jts commented 9 years ago

Can you make the input file available to me so I can look into this?

BoMatt commented 9 years ago

I don't know if it works in Google Drive, let me know. https://drive.google.com/file/d/0BzIk2g1Es23ZMURRdjFWUm9MOVE/view

jts commented 9 years ago

I just tested your file and it worked for me:

>make -f nanocorrect/nanocorrect-overlap.make INPUT=all2Dbutyricum.fasta NAME=test.v1
nanocorrect-preprocess.pl all2Dbutyricum.fasta > test.v1.pp.fasta
fasta2DB test.v1 test.v1.pp.fasta
DBsplit -s50 test.v1
DBdust test.v1
HPCdaligner -t5 -mdust test.v1 > HPCcommands.txt
/bin/bash HPCcommands.txt
LAcat test.v1 > test.v1.las
rm test.v1.*.las

Can you try it again in a completely clean directory? daligner uses hidden files and if these aren't removed it can make subsequent runs crash.
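One way to check whether a directory is really clean is to list leftover database files before re-running. This is only a sketch: it assumes the hidden files follow the `.NAME.*` pattern seen elsewhere in this thread (for example `.test.v1.idx` and `.test.v1.bps`) next to a visible `NAME.db`, and `find_stale_db_files` is a hypothetical helper, not part of daligner or DAZZ_DB.

```python
from pathlib import Path


def find_stale_db_files(directory, name):
    """Return leftover database files for NAME in directory, hidden ones included.

    Assumption: hidden files are named `.NAME.<suffix>` and the visible
    database stub is `NAME.db`. A NAME that is a dotted prefix of another
    NAME (e.g. "test" vs "test.v1") would also match that other database.
    """
    root = Path(directory)
    stale = [
        p for p in root.iterdir()
        if p.is_file() and p.name.startswith(f".{name}.")
    ]
    db_file = root / f"{name}.db"
    if db_file.is_file():
        stale.append(db_file)
    return sorted(stale)
```

An empty result for the NAME you are about to use suggests that no files from an earlier run remain in that directory.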

BoMatt commented 9 years ago

I tried it again but always get the same error. To be honest, I don't know how to solve it; it seems to be a problem with my PC or with DALIGNER, especially since I delete the hidden files after every run.

jts commented 9 years ago

Ok. What OS are you running and what versions of python/daligner?

BoMatt commented 9 years ago

Ubuntu 15.04, Python 2.7.9. For daligner, I think it's the latest version, because git pull says it is already up to date.

jts commented 9 years ago

Can you run the exact command that I pasted above and send me the md5sums of all files?

BoMatt commented 9 years ago

Let me know if you need something else.

3b0c5878414f197d631e176ff90195f9 test.v1.db
6df6413e3fd993c5e89821698a66a16f test.v1.pp.fasta
d54bd5b69d039b84f7420c9495d26ea4 HPCcommands.txt
3cfa46cfb891db70145ae6cbc246ba53 .test.v1.bps
b921d466c5e407f8a5bf5e5ebd14f94c .test.v1.dust.anno
98a015b601fd277736877f622b9c089c .test.v1.dust.data
34fbe33e98d072c4b0cbfce5d09ecfa9 .test.v1.idx

jts commented 9 years ago

They match my files except for .test.v1.dust.anno, .test.v1.dust.data and .test.v1.idx. I don't know why running DBdust on the same file (test.v1.pp.fasta) would give different results.
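For anyone repeating this kind of comparison, here is a minimal sketch of how it could be scripted. The helper names `md5_of_file` and `differing_files` are hypothetical; each side would build a mapping from file name to checksum and then compare the two.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_files(mine, theirs):
    """Return sorted names whose checksum differs or that exist on one side only.

    Both arguments map file name -> hex digest.
    """
    names = set(mine) | set(theirs)
    return sorted(n for n in names if mine.get(n) != theirs.get(n))
```

Files that come back from `differing_files` are the ones worth inspecting first, as with the three hidden files named above.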

joshquick commented 9 years ago

I’ve noticed that before whilst running the same input files on two different VMs.


BoMatt commented 9 years ago

I deleted and reinstalled DAZZ_DB, and now it seems to be working!

jts commented 9 years ago

Great!

BoMatt commented 9 years ago

And now I have the same error again! It's quite strange...

BoMatt commented 9 years ago

I tried with different parameters. It seems that you have to create a new folder every time and delete the old one; at least, it's working for me that way.

jts commented 9 years ago

Have you tried running DAZZ_DB/DBrm in between runs? It's the proper way of cleaning up the hidden files, which I think might be the problem.

BoMatt commented 9 years ago

I will let you know after my next run. I didn't know there was a command to remove the files. Thanks for the help!

seppinho commented 8 years ago

Hi, I'm currently experiencing the same issues as reported by MattiaBo with my 2D reads. Output:

/bin/bash HPCcommands.txt
HPCcommands.txt: line 2: 19 Segmentation fault (core dumped) daligner -t5 -mdust test2.1 test2.1
LAsort: Cannot open ./test2.1.test2.1.C0.las for 'r'

I did everything within Docker and started with a fresh Ubuntu 14.04 image with python 2.7.6. So I don't think it's related to the database. Things I've installed:

I could also provide my fasta file if necessary. Thanks for your help! Sebastian

jts commented 8 years ago

I've just noticed that daligner and dazz_db were recently updated. Would you mind trying the versions that we used in our earlier paper to see if the old code works on your data? The commit IDs are here:

https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L83

seppinho commented 8 years ago

Hej, indeed, that did the trick. Thanks a lot for the help!!

DALIGNER version: git checkout 549da77b91395dd && make
DAZZ_DB version: git checkout 8cb2f29c4011a2c2 && make

jts commented 8 years ago

Thanks for testing that. It looks like I'll have to modify nanocorrect to be compatible with the latest daligner.

seppinho commented 8 years ago

happy to help

EvdH0 commented 8 years ago

I can second this. With the current DALIGNER and DAZZ_DB versions I get a segmentation fault:

make -f nanocorrect-overlap.make INPUT=2D.min800bp.fasta NAME=nc
/bin/bash HPCcommands.txt
HPCcommands.txt: line 2: 11942 Segmentation fault: 11 daligner -t5 -mdust nc.1 nc.1
LAsort: Cannot open ./nc.1.nc.1.C0.las for 'r'
make: *** [nc.las] Error 1

However, going back to the two versions that @seppinho mentioned, which are also in the makefile (https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L83), it does indeed work.

duspriya commented 6 years ago

Hi, I'm currently experiencing the same issue as reported:

make -f nanocorrect-overlap.make INPUT=reads.fasta NAME=test.v1
nanocorrect-preprocess.pl reads.fasta > test.v1.pp.fasta
/bin/bash: nanocorrect-preprocess.pl: command not found
nanocorrect-overlap.make:14: recipe for target 'test.v1.pp.fasta' failed
make: *** [test.v1.pp.fasta] Error 127

All dependencies are updated to the latest version. What could be the problem?

jts commented 6 years ago

Hi @duspriya,

As noted in the README, this software is deprecated and should not be used. I suggest using racon for error correction instead.

In case you are curious, though, the problem is that the nanocorrect-preprocess.pl script could not be found. You should check whether it is present in the same directory as nanocorrect-overlap.make.
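A rough sketch of that check follows. Exit status 127 is the shell's "command not found", which means the bare script name did not resolve on PATH. The helper `locate_script` is hypothetical: it looks on PATH first and then next to the makefile. If the script turns up only next to the makefile, adding that directory to PATH is the likely fix, though that is an assumption about your setup.

```python
import os
import shutil


def locate_script(script, makefile_path):
    """Return a path where `script` can be found, or None.

    Checks PATH first (how the shell resolves a bare command name),
    then the directory that holds the makefile.
    """
    on_path = shutil.which(script)
    if on_path:
        return on_path
    makefile_dir = os.path.dirname(os.path.abspath(makefile_path))
    candidate = os.path.join(makefile_dir, script)
    return candidate if os.path.isfile(candidate) else None
```

For example, locate_script("nanocorrect-preprocess.pl", "nanocorrect-overlap.make") returning None would mean the script is neither on PATH nor beside the makefile.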

Jared