marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

cannot find the corrected reads file #1792

Closed longzhangnation closed 4 years ago

longzhangnation commented 4 years ago

Hi, I am using Canu to correct my nanopore reads. My command is:

~/canu-2.1/bin/canu -correct useGrid=true gridEngineArrayMaxJobs=500 -p test -d canu_zl genomeSize=1.3g -nanopore sampled_15x_pass_nano.fastq.gz

When the program stopped, I checked canu_zl/correction/ but could not find the corrected reads files. I only see some FASTA files under 1-overlapper/blocks/. The canu.out looks like this:


  Found perl:
    /usr/bin/perl
 perl: warning: Setting locale failed.
 perl: warning: Please check that your locale settings:
         LANGUAGE = (unset),
         LC_ALL = (unset),
         LANG = "en_US.UTF-8"
     are supported and installed on your system.
 perl: warning: Falling back to the standard locale ("C").
    This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi

 Found java:
    /BIGDATA1/app/jdk/8u141-b15-gcc-4.8.5/bin/java
    java version "1.8.0_141"

 Found canu:
    /BIGDATA1/sysu_mhwang_1/canu-2.1/bin/canu
 perl: warning: Setting locale failed.
 perl: warning: Please check that your locale settings:
         LANGUAGE = (unset),
         LC_ALL = (unset),
         LANG = "en_US.UTF-8"
     are supported and installed on your system.
 perl: warning: Falling back to the standard locale ("C").
    canu 2.1

 perl: warning: Setting locale failed.
 perl: warning: Please check that your locale settings:
         LANGUAGE = (unset),
         LC_ALL = (unset),
         LANG = "en_US.UTF-8"
     are supported and installed on your system.
 perl: warning: Falling back to the standard locale ("C").
 -- canu 2.1
 --
 -- CITATIONS
 --
 -- For 'standard' assemblies of PacBio or Nanopore reads:
 --   Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
 --   Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
 --   Genome Res. 2017 May;27(5):722-736.
 --   http://doi.org/10.1101/gr.215087.116
 --
 -- Read and contig alignments during correction and consensus use:
 --   Šošic M, Šikic M.
 --   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
 --   Bioinformatics. 2017 May 1;33(9):1394-1395.
 --   http://doi.org/10.1093/bioinformatics/btw753
 --
 -- Overlaps are generated using:
 --   Berlin K, et al.
 --   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
 --   Nat Biotechnol. 2015 Jun;33(6):623-30.
 --   http://doi.org/10.1038/nbt.3238
 --
 --   Myers EW, et al.
 --   A Whole-Genome Assembly of Drosophila.
 --   Science. 2000 Mar 24;287(5461):2196-204.
 --   http://doi.org/10.1126/science.287.5461.2196
 --
 -- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
 --   Chin CS, et al.
 --   Phased diploid genome assembly with single-molecule real-time sequencing.
 --   Nat Methods. 2016 Dec;13(12):1050-1054.
 --   http://doi.org/10.1038/nmeth.4035
 --
 -- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
 --   Chin CS, et al.
 --   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
 --   Nat Methods. 2013 Jun;10(6):563-9
 --   http://doi.org/10.1038/nmeth.2474
 --
 -- CONFIGURE CANU
 --
 -- Detected Java(TM) Runtime Environment '1.8.0_141' (from '/BIGDATA1/app/jdk/8u141-b15-gcc-4.8.5/bin/java') with -d64 support.
 -- Detected gnuplot version '5.0 patchlevel 5   ' (from 'gnuplot') and image format 'svg'.
 Can't exec "getconf": No such file or directory at /BIGDATA1/sysu_mhwang_1/canu-2.1/bin/../lib/site_perl/canu/Defaults.pm line 318.
 -- Detected 0 CPUs and 63 gigabytes of memory.
 -- Detected Slurm with 'sinfo' binary in /usr/bin/sinfo.
 -- Detected Slurm with task IDs up to 1000 allowed.
 --
 -- Found 1954 hosts with  24 cores and   62 GB memory under Slurm control.
 --
 --                         (tag)Threads
 --                (tag)Memory         |
 --        (tag)             |         |  algorithm
 --        -------  ----------  --------  -----------------------------
 -- Grid:  meryl     31.000 GB    8 CPUs  (k-mer counting)
 -- Grid:  hap       16.000 GB   24 CPUs  (read-to-haplotype assignment)
 -- Grid:  cormhap   31.000 GB   12 CPUs  (overlap detection with mhap)
 -- Grid:  obtovl    16.000 GB   12 CPUs  (overlap detection)
 -- Grid:  utgovl    16.000 GB   12 CPUs  (overlap detection)
 -- Grid:  cor       24.000 GB    4 CPUs  (read correction)
 -- Grid:  ovb        4.000 GB    1 CPU   (overlap store bucketizer)
 -- Grid:  ovs       32.000 GB    1 CPU   (overlap store sorting)
 -- Grid:  red       20.000 GB    8 CPUs  (read error detection)
 -- Grid:  oea        8.000 GB    1 CPU   (overlap error adjustment)
 -- Grid:  bat       62.000 GB   16 CPUs  (contig construction with bogart)
 -- Grid:  cns        -.--- GB    8 CPUs  (consensus)
 --
 -- In 'test.seqStore', found Nanopore reads:
 --   Nanopore:                 1
 --
 --   Raw:                      1
 --
 -- Generating assembly 'test' in '/BIGDATA1/sysu_mhwang_1/canu_zl':
 --    - only correct raw reads.
 --
 -- Parameters:
 --
 --  genomeSize        1300000000
 --
 --  Overlap Generation Limits:
 --    corOvlErrorRate 0.3200 ( 32.00%)
 --    obtOvlErrorRate 0.1200 ( 12.00%)
 --    utgOvlErrorRate 0.1200 ( 12.00%)
 --
 --  Overlap Processing Limits:
 --    corErrorRate    0.5000 ( 50.00%)
 --    obtErrorRate    0.1200 ( 12.00%)
 --    utgErrorRate    0.1200 ( 12.00%)
 --    cnsErrorRate    0.2000 ( 20.00%)
 --
 --
 -- BEGIN CORRECTION
 --
 --
 -- OVERLAPPER (mhap) (correction) complete, not rewriting scripts.
 --
 --
 -- Mhap precompute jobs failed, tried 2 times, giving up.
 --   job correction/1-overlapper/blocks/000001.dat FAILED.
 --   job correction/1-overlapper/blocks/000002.dat FAILED.
 --   job correction/1-overlapper/blocks/000003.dat FAILED.
 --   job correction/1-overlapper/blocks/000004.dat FAILED.
 --   job correction/1-overlapper/blocks/000005.dat FAILED.
 --   job correction/1-overlapper/blocks/000006.dat FAILED.
 --   job correction/1-overlapper/blocks/000007.dat FAILED.
 --   job correction/1-overlapper/blocks/000008.dat FAILED.
 --   job correction/1-overlapper/blocks/000009.dat FAILED.
 --   job correction/1-overlapper/blocks/000010.dat FAILED.
 --   job correction/1-overlapper/blocks/000011.dat FAILED.
 --   job correction/1-overlapper/blocks/000012.dat FAILED.
 --   job correction/1-overlapper/blocks/000013.dat FAILED.
 --   job correction/1-overlapper/blocks/000014.dat FAILED.
 --   job correction/1-overlapper/blocks/000015.dat FAILED.
 --   job correction/1-overlapper/blocks/000016.dat FAILED.
 --   job correction/1-overlapper/blocks/000017.dat FAILED.
 --   job correction/1-overlapper/blocks/000018.dat FAILED.
 --   job correction/1-overlapper/blocks/000019.dat FAILED.
 --   job correction/1-overlapper/blocks/000020.dat FAILED.
 --

 ABORT:
 ABORT: canu 2.1
 ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
 ABORT: Try restarting.  If that doesn't work, ask for help.
 ABORT:

Should I worry about this error? Where can I find the corrected files? Or should I restart the command? I hope you can give me some advice.

skoren commented 4 years ago

This is most likely a JVM issue (see #1764 for an example). Please post one of the logs from the failed jobs so we can check what the issue is.
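To pick a log to post, the failed block numbers can be pulled out of canu.out. This is a minimal sketch, assuming the FAILED lines look exactly like the ones in the output above. `failed_blocks` is a hypothetical helper name, not part of Canu, and the location of canu.out is an assumption.

```shell
# Print the block numbers that canu.out reports as FAILED, one per line.
# Assumes lines of the form seen in this thread:
#   --   job correction/1-overlapper/blocks/000018.dat FAILED.
# failed_blocks is a hypothetical helper, not a Canu command.
failed_blocks() {
    sed -n 's|.*job correction/1-overlapper/blocks/\([0-9]*\)\.dat FAILED\..*|\1|p' "$1"
}

# Usage (path is an assumption; adjust to where canu.out was written):
#   failed_blocks canu_zl/canu.out
```

Each number printed corresponds to one precompute job, so the log of any one of them should be enough to diagnose the failure.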

longzhangnation commented 4 years ago

Here is the log. I did not find any errors in it.

 Found perl:
    /usr/bin/perl
 perl: warning: Setting locale failed.
 perl: warning: Please check that your locale settings:
         LANGUAGE = (unset),
         LC_ALL = (unset),
         LANG = "zh_CN.utf8"
     are supported and installed on your system.
 perl: warning: Falling back to the standard locale ("C").
    This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi

 Found java:
    /BIGDATA1/app/jdk/8u141-b15-gcc-4.8.5/bin/java
    java version "1.8.0_141"

 Found canu:
    /BIGDATA1/sysu_mhwang_1/canu-2.1/bin/canu
 perl: warning: Setting locale failed.
 perl: warning: Please check that your locale settings:
         LANGUAGE = (unset),
         LC_ALL = (unset),
         LANG = "zh_CN.utf8"
     are supported and installed on your system.
 perl: warning: Falling back to the standard locale ("C").
    canu 2.1

 Running job 18 based on SLURM_ARRAY_TASK_ID=18 and offset=0.
 /BIGDATA1/sysu_mhwang_1/canu_zl/correction/1-overlapper
 Opened seqStore '../../test.seqStore' for 'raw' reads.
 Dumping raw reads from 711451 to 753300 (inclusive).

Starting mhap precompute.

 Running with these settings:
 --filter-threshold = 1.0E-7
 --help = false
 --max-shift = 0.2
 --min-olap-length = 500
 --min-store-length = 0
 --no-rc = false
 --no-self = false
 --no-tf = false
 --num-hashes = 768
 --num-min-matches = 2
 --num-threads = 12
 --ordered-kmer-size = 12
 --ordered-sketch-size = 1536
 --repeat-idf-scale = 10.0
 --repeat-weight = 0.9
 --settings = 0
 --store-full-id = true
 --supress-noise = 0
 --threshold = 0.73
 --version = false
 -f = ../../0-mercounts/test.ms16.ignore.gz
 -h = false
 -k = 16
 -p = ./000018.input.fasta
 -q = .
 -s =

 Reading in filter file ../../0-mercounts/test.ms16.ignore.gz.
 Read in values for repeat 0 and 0
 Warning, k-mer filter file has zero elements.
 Initializing
 Initialized
 Time (s) to read filter file: 0.108373642
 Read in k-mer filter for sizes: []
 Processing FASTA files for binary compression...
 Current # sequences loaded and processed from file: 5000...

skoren commented 4 years ago

Yes, the log stops without reporting an error, which implies your grid is likely killing these jobs, similar to the issue I pointed to (#1764). Check the status of the jobs on the grid: the command used to run them should be in 1-overlapper/precompute.jobSubmit*sh and their IDs in 1-overlapper/precompute.jobSubmit*out. If the grid is killing them, you can try the same parameter change as in that issue to see if the jobs succeed.
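As a rough sketch of that check on Slurm: the snippet below assumes the precompute.jobSubmit*out files contain sbatch's usual "Submitted batch job <id>" line and that job accounting (sacct) is enabled on the cluster. `job_ids` is a hypothetical helper name, not part of Canu or Slurm.

```shell
# Extract Slurm job IDs from sbatch output lines of the form
#   Submitted batch job 1234567
# Reads the named files, or standard input if none are given.
# job_ids is a hypothetical helper, not a Canu or Slurm command.
job_ids() {
    sed -n 's/.*Submitted batch job \([0-9][0-9]*\).*/\1/p' "$@"
}

# Usage (not run here; requires Slurm accounting):
#   cd canu_zl/correction/1-overlapper
#   for id in $(job_ids precompute.jobSubmit*out); do
#       sacct -j "$id" --format=JobID,State,ExitCode,Elapsed,MaxRSS,ReqMem
#   done
```

A State such as OUT_OF_MEMORY, TIMEOUT, or CANCELLED for the array tasks would be consistent with the scheduler killing the jobs rather than mhap itself failing.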

skoren commented 4 years ago

Any update?

skoren commented 4 years ago

Idle