AndresICM closed this issue 6 years ago.
It looks like you're using an intermediate between 1.6 and 1.7, not a release. I would suggest switching to 1.7 if you can.
As for the error, it's most likely a Java issue. What's the output in correction/1-overlapper/precompute.000001.out? Also, how much memory did you reserve for the job when you submitted it to the remote server?
Thanks for the quick response. That's the output:
Running job 1 based on command line options.
/gpfs/hps/soft/rhel7/canu/1.6/Linux-amd64/bin/gatekeeperDumpFASTQ: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /gpfs/hps/soft/rhel7/canu/1.6/Linux-amd64/bin/gatekeeperDumpFASTQ)
mv: cannot stat './blocks/000001.input.fasta': No such file or directory
Failed to extract fasta.
Ah, it's a library/linking error, not a Canu error. It looks like the machine you're running on doesn't have the same library versions as the machine you compiled on, though it's strange it didn't fail right away. Perhaps the compilation is corrupted and mixes two library versions. What do
ldd /gpfs/hps/soft/rhel7/canu/1.6/Linux-amd64/bin/gatekeeperDumpFASTQ
and
ldd /gpfs/hps/soft/rhel7/canu/1.6/Linux-amd64/bin/gatekeeperCreate
report?
Most likely you need to recompile after removing the binary directory or you could use the pre-compiled binaries from the release page.
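If recompiling, a minimal sketch of that clean rebuild (the binary directory is the one from the error message; the source-tree path is a placeholder for wherever Canu was unpacked):

```shell
# Remove the old binary directory so every object is relinked against one
# consistent set of libraries, then rebuild. Paths are illustrative.
rm -rf /gpfs/hps/soft/rhel7/canu/1.6/Linux-amd64
cd canu/src        # wherever the Canu source tree lives
make -j 8          # regenerates ../Linux-amd64/bin from scratch
```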
I'm not sure what you mean or how to do that. gatekeeperDumpFASTQ and gatekeeperCreate show a lot of random characters. I don't know if there's any use in pasting them here.
Basically, whoever installed Canu didn't compile it correctly. ldd should be telling you what a program is linked against like this:
% ldd canu/Linux-amd64/bin/gatekeeperDumpFASTQ
linux-vdso.so.1 => (0x00007ffca9bbe000)
libstdc++.so.6 => /opt/sw/software/gcc/4.8.5/lib64/libstdc++.so.6 (0x00007ff8ef09d000)
libm.so.6 => /lib64/libm.so.6 (0x00000039fce00000)
libgomp.so.1 => /opt/sw/software/gcc/4.8.5/lib64/libgomp.so.1 (0x00007ff8eee7f000)
libgcc_s.so.1 => /opt/sw/software/gcc/4.8.5/lib64/libgcc_s.so.1 (0x00007ff8eec68000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000039fd200000)
libc.so.6 => /lib64/libc.so.6 (0x00000039fca00000)
/lib64/ld-linux-x86-64.so.2 (0x00000039fc600000)
librt.so.1 => /lib64/librt.so.1 (0x00000039fda00000)
so I wanted to see if the binaries are linked to the same library or not. You could just try downloading the release binaries instead and run with those.
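If the full ldd output is hard to read, a quick sketch for filtering it down to the problem entries (same binary path as above); a correctly linked binary prints nothing here:

```shell
# Show only the shared-library dependencies the dynamic loader cannot resolve.
ldd /gpfs/hps/soft/rhel7/canu/1.6/Linux-amd64/bin/gatekeeperDumpFASTQ | grep 'not found'
```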
Idle, library/installation issue.
I also keep having this problem trying to test Canu. Note that I'm a new user of both Canu and Linux. I downgraded Java to 8 and I installed canu 8, but as you can see from the folder it is named 1.8 for some reason. Here is what I get:
./canu-1.8/*/bin/canu -p ecoli -d ecoli-oxford genomeSize=4.8m -nanopore-raw oxford.fasta useGrid=false
-- Canu 1.8
--
-- CITATIONS
--
-- Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
-- Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
-- Genome Res. 2017 May;27(5):722-736.
-- http://doi.org/10.1101/gr.215087.116
--
-- Koren S, Rhie A, Walenz BP, Dilthey AT, Bickhart DM, Kingan SB, Hiendleder S, Williams JL, Smith TPL, Phillippy AM.
-- De novo assembly of haplotype-resolved genomes with trio binning.
-- Nat Biotechnol. 2018
-- https://doi.org/10.1038/nbt.4277
--
-- Read and contig alignments during correction, consensus and GFA building use:
-- Šošić M, Šikić M.
-- Edlib: a C/C++ library for fast, exact sequence alignment using edit distance.
-- Bioinformatics. 2017 May 1;33(9):1394-1395.
-- http://doi.org/10.1093/bioinformatics/btw753
--
-- Overlaps are generated using:
-- Berlin K, et al.
-- Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
-- Nat Biotechnol. 2015 Jun;33(6):623-30.
-- http://doi.org/10.1038/nbt.3238
--
-- Myers EW, et al.
-- A Whole-Genome Assembly of Drosophila.
-- Science. 2000 Mar 24;287(5461):2196-204.
-- http://doi.org/10.1126/science.287.5461.2196
--
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
-- Chin CS, et al.
-- Phased diploid genome assembly with single-molecule real-time sequencing.
-- Nat Methods. 2016 Dec;13(12):1050-1054.
-- http://doi.org/10.1038/nmeth.4035
--
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
-- Chin CS, et al.
-- Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
-- Nat Methods. 2013 Jun;10(6):563-9
-- http://doi.org/10.1038/nmeth.2474
--
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_191' (from '/usr/lib/jvm/java-8-oracle/bin/java') with -d64 support.
-- Detected gnuplot version '5.2 patchlevel 2 ' (from 'gnuplot') and image format 'png'.
-- Detected 4 CPUs and 8 gigabytes of memory.
-- Detected Slurm with 'sinfo' binary in /usr/local/bin/sinfo.
-- Grid engine disabled per useGrid=false option.
--
-- (tag)Concurrency
-- (tag)Threads |
-- (tag)Memory | |
-- (tag) | | | total usage algorithm
-- ------- ------ -------- -------- ----------------- -----------------------------
-- Local: meryl 8 GB 4 CPUs x 1 job 8 GB 4 CPUs (k-mer counting)
-- Local: hap 8 GB 4 CPUs x 1 job 8 GB 4 CPUs (read-to-haplotype assignment)
-- Local: cormhap 6 GB 4 CPUs x 1 job 6 GB 4 CPUs (overlap detection with mhap)
-- Local: obtovl 4 GB 4 CPUs x 1 job 4 GB 4 CPUs (overlap detection)
-- Local: utgovl 4 GB 4 CPUs x 1 job 4 GB 4 CPUs (overlap detection)
-- Local: ovb 4 GB 1 CPU x 2 jobs 8 GB 2 CPUs (overlap store bucketizer)
-- Local: ovs 8 GB 1 CPU x 1 job 8 GB 1 CPU (overlap store sorting)
-- Local: red 8 GB 4 CPUs x 1 job 8 GB 4 CPUs (read error detection)
-- Local: oea 4 GB 1 CPU x 2 jobs 8 GB 2 CPUs (overlap error adjustment)
-- Local: bat 8 GB 4 CPUs x 1 job 8 GB 4 CPUs (contig construction with bogart)
-- Local: gfa 8 GB 4 CPUs x 1 job 8 GB 4 CPUs (GFA alignment and processing)
--
-- In 'ecoli.seqStore', found Nanopore reads:
-- Raw: 20365
-- Corrected: 0
-- Trimmed: 0
--
-- Generating assembly 'ecoli' in '/home/simon/ecoli-oxford'
--
-- Parameters:
--
-- genomeSize 4800000
--
-- Overlap Generation Limits:
-- corOvlErrorRate 0.3200 ( 32.00%)
-- obtOvlErrorRate 0.1200 ( 12.00%)
-- utgOvlErrorRate 0.1200 ( 12.00%)
--
-- Overlap Processing Limits:
-- corErrorRate 0.5000 ( 50.00%)
-- obtErrorRate 0.1200 ( 12.00%)
-- utgErrorRate 0.1200 ( 12.00%)
-- cnsErrorRate 0.2000 ( 20.00%)
--
--
-- BEGIN CORRECTION
--
--
-- Running jobs. First attempt out of 2.
----------------------------------------
-- Starting 'cormhap' concurrent execution on Thu Oct 25 13:36:03 2018 with 206.06 GB free disk space (3 processes; 1 concurrently)
cd correction/1-overlapper
./precompute.sh 1 > ./precompute.000001.out 2>&1
./precompute.sh 2 > ./precompute.000002.out 2>&1
./precompute.sh 3 > ./precompute.000003.out 2>&1
-- Finished on Thu Oct 25 13:36:03 2018 (in the blink of an eye) with 206.06 GB free disk space
----------------------------------------
--
-- Mhap precompute jobs failed, retry.
-- job correction/1-overlapper/blocks/000001.dat FAILED.
-- job correction/1-overlapper/blocks/000002.dat FAILED.
-- job correction/1-overlapper/blocks/000003.dat FAILED.
--
--
-- Running jobs. Second attempt out of 2.
----------------------------------------
-- Starting 'cormhap' concurrent execution on Thu Oct 25 13:36:03 2018 with 206.06 GB free disk space (3 processes; 1 concurrently)
cd correction/1-overlapper
./precompute.sh 1 > ./precompute.000001.out 2>&1
./precompute.sh 2 > ./precompute.000002.out 2>&1
./precompute.sh 3 > ./precompute.000003.out 2>&1
-- Finished on Thu Oct 25 13:36:03 2018 (in the blink of an eye) with 206.06 GB free disk space
----------------------------------------
--
-- Mhap precompute jobs failed, tried 2 times, giving up.
-- job correction/1-overlapper/blocks/000001.dat FAILED.
-- job correction/1-overlapper/blocks/000002.dat FAILED.
-- job correction/1-overlapper/blocks/000003.dat FAILED.
--
ABORT:
ABORT: Canu 1.8
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
and then
cat /home/user/ecoli-oxford/correction/1-overlapper/precompute.*.out
Running job 1 based on command line options.
./precompute.sh: 81: ./precompute.sh: /usr/lib/canu/gatekeeperDumpFASTQ: not found
mv: cannot stat './blocks/000001.input.fasta': No such file or directory
Failed to extract fasta.
Running job 2 based on command line options.
./precompute.sh: 81: ./precompute.sh: /usr/lib/canu/gatekeeperDumpFASTQ: not found
mv: cannot stat './blocks/000002.input.fasta': No such file or directory
Failed to extract fasta.
Running job 3 based on command line options.
./precompute.sh: 81: ./precompute.sh: /usr/lib/canu/gatekeeperDumpFASTQ: not found
mv: cannot stat './blocks/000003.input.fasta': No such file or directory
Failed to extract fasta.
Could you please help me?
How did you install Canu? There is no Canu 8; the latest release is 1.8. However, this installation is not valid: it is missing the binaries needed to run Canu (or at least they are not where they should be). It's trying to find them in /usr/lib/canu/, which doesn't seem right, and version 1.8 shouldn't have any files named gatekeeper*.
Download the pre-compiled binaries for a release for your system and run using that install instead.
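One way to do that, assuming the v1.8 Linux tarball on the GitHub releases page (the asset name matches the one reported later in this thread; the exact URL is an assumption):

```shell
# Fetch and unpack the pre-compiled Canu 1.8 release for Linux x86-64.
curl -L -O https://github.com/marbl/canu/releases/download/v1.8/canu-1.8.Linux-amd64.tar.xz
tar -xJf canu-1.8.Linux-amd64.tar.xz
./canu-1.8/Linux-amd64/bin/canu --version   # should report Canu 1.8
```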
Hi Sergey, and thank you for your response. I was just wondering: are there pre-compiled binaries of Canu for Ubuntu 18.04? I can't actually locate them.
Problem solved by installing the canu-1.8.Linux-amd64.tar.xz pre-compiled binaries for Linux.
I would also like to ask if you can help me set the meryl memory to something like 7 GB. Do you think that is possible?
./canu-1.8/*/bin/canu -p ecoli -d ecoli-oxford genomeSize=4.8m -nanopore-raw oxford.fasta useGrid=false
-- (tag)Concurrency
-- (tag)Threads |
-- (tag)Memory | |
-- (tag) | | | total usage algorithm
-- ------- ------ -------- -------- ----------------- -----------------------------
-- segments memory batches
ABORT:
ABORT: Canu 1.8
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
ABORT: failed to parse meryl configure output 'correction/0-mercounts/ecoli.ms16.config.01.out'.
ABORT:
ABORT: Disk space available: 207.951 GB
ABORT:
ABORT: Last 50 lines of the relevant log file (correction/0-mercounts/ecoli.ms16.config.01.out):
ABORT:
ABORT:   equal-to N            return kmers that occur exactly N times in the input.  accepts exactly one input.
ABORT:   not-equal-to N        return kmers that do not occur exactly N times in the input.  accepts exactly one input.
ABORT:
ABORT:   increase X            add X to the count of each kmer.
ABORT:   decrease X            subtract X from the count of each kmer.
ABORT:   multiply X            multiply the count of each kmer by X.
ABORT:   divide X              divide the count of each kmer by X.
ABORT:   modulo X              set the count of each kmer to the remainder of the count divided by X.
ABORT:
ABORT:   union                 return kmers that occur in any input, set the count to the number of inputs with this kmer.
ABORT:   union-min             return kmers that occur in any input, set the count to the minimum count
ABORT:   union-max             return kmers that occur in any input, set the count to the maximum count
ABORT:   union-sum             return kmers that occur in any input, set the count to the sum of the counts
ABORT:
ABORT:   intersect             return kmers that occur in all inputs, set the count to the count in the first input.
ABORT:   intersect-min         return kmers that occur in all inputs, set the count to the minimum count.
ABORT:   intersect-max         return kmers that occur in all inputs, set the count to the maximum count.
ABORT:   intersect-sum         return kmers that occur in all inputs, set the count to the sum of the counts.
ABORT:
ABORT:   difference            return kmers that occur in the first input, but none of the other inputs
ABORT:   symmetric-difference  return kmers that occur in exactly one input
ABORT:
ABORT: MODIFIERS:
ABORT:
ABORT:   output O              write kmers generated by the present command to an output meryl database O
ABORT:                         mandatory for count operations.
ABORT:
ABORT: EXAMPLES:
ABORT:
ABORT: Example:  Report 22-mers present in at least one of input1.fasta and input2.fasta.
ABORT:           Kmers from each input are saved in meryl databases 'input1' and 'input2',
ABORT:           but the kmers in the union are only reported to the screen.
ABORT:
ABORT:           meryl print \
ABORT:             union \
ABORT:               [count k=22 input1.fasta output input1] \
ABORT:               [count k=22 input2.fasta output input2]
ABORT:
ABORT: Example:  Find the highest count of each kmer present in both files, save the kmers to
ABORT:           database 'maxCount'.
ABORT:
ABORT:           meryl intersect-max input1 input2 output maxCount
ABORT:
ABORT: Example:  Find unique kmers common to both files.  Brackets are necessary
ABORT:           on the first 'equal-to' command to prevent the second 'equal-to' from
ABORT:           being used as an input to the first 'equal-to'.
ABORT:
ABORT:           meryl intersect [equal-to 1 input1] equal-to 1 input2
ABORT:
ABORT: Requested memory 'memory=8' (GB) is more than physical memory 7.68 GB.
./canu-1.8/*/bin/canu -p ecoli -d ecoli-oxford genomeSize=4.8m -nanopore-raw oxford.fasta useGrid=false Memory=6
doesn't seem to work
You should use merylMemory, or better yet maxMemory. maxMemory=7 should limit all steps.
Yes I tried that and unfortunately it doesn't work.
-- (tag)Concurrency
-- (tag)Threads |
-- (tag)Memory | |
-- (tag) | | | total usage algorithm
-- ------- ------ -------- -------- ----------------- -----------------------------
-- segments memory batches
ABORT:
ABORT: Canu 1.8
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
ABORT: failed to parse meryl configure output 'correction/0-mercounts/ecoli.ms16.config.01.out'.
ABORT:
ABORT: Disk space available: 207.545 GB
ABORT:
ABORT: Last 50 lines of the relevant log file (correction/0-mercounts/ecoli.ms16.config.01.out):
ABORT:
ABORT:   equal-to N            return kmers that occur exactly N times in the input.  accepts exactly one input.
ABORT:   not-equal-to N        return kmers that do not occur exactly N times in the input.  accepts exactly one input.
ABORT:
ABORT:   increase X            add X to the count of each kmer.
ABORT:   decrease X            subtract X from the count of each kmer.
ABORT:   multiply X            multiply the count of each kmer by X.
ABORT:   divide X              divide the count of each kmer by X.
ABORT:   modulo X              set the count of each kmer to the remainder of the count divided by X.
ABORT:
ABORT:   union                 return kmers that occur in any input, set the count to the number of inputs with this kmer.
ABORT:   union-min             return kmers that occur in any input, set the count to the minimum count
ABORT:   union-max             return kmers that occur in any input, set the count to the maximum count
ABORT:   union-sum             return kmers that occur in any input, set the count to the sum of the counts
ABORT:
ABORT:   intersect             return kmers that occur in all inputs, set the count to the count in the first input.
ABORT:   intersect-min         return kmers that occur in all inputs, set the count to the minimum count.
ABORT:   intersect-max         return kmers that occur in all inputs, set the count to the maximum count.
ABORT:   intersect-sum         return kmers that occur in all inputs, set the count to the sum of the counts.
ABORT:
ABORT:   difference            return kmers that occur in the first input, but none of the other inputs
ABORT:   symmetric-difference  return kmers that occur in exactly one input
ABORT:
ABORT: MODIFIERS:
ABORT:
ABORT:   output O              write kmers generated by the present command to an output meryl database O
ABORT:                         mandatory for count operations.
ABORT:
ABORT: EXAMPLES:
ABORT:
ABORT: Example:  Report 22-mers present in at least one of input1.fasta and input2.fasta.
ABORT:           Kmers from each input are saved in meryl databases 'input1' and 'input2',
ABORT:           but the kmers in the union are only reported to the screen.
ABORT:
ABORT:           meryl print \
ABORT:             union \
ABORT:               [count k=22 input1.fasta output input1] \
ABORT:               [count k=22 input2.fasta output input2]
ABORT:
ABORT: Example:  Find the highest count of each kmer present in both files, save the kmers to
ABORT:           database 'maxCount'.
ABORT:
ABORT:           meryl intersect-max input1 input2 output maxCount
ABORT:
ABORT: Example:  Find unique kmers common to both files.  Brackets are necessary
ABORT:           on the first 'equal-to' command to prevent the second 'equal-to' from
ABORT:           being used as an input to the first 'equal-to'.
ABORT:
ABORT:           meryl intersect [equal-to 1 input1] equal-to 1 input2
ABORT:
ABORT: Requested memory 'memory=8' (GB) is more than physical memory 7.68 GB.
ABORT:
The scripts Canu writes to run these jobs are not recreated when parameters change; the scripts in correction/0-mercounts still have the old memory size (8 GB). Just remove that 0-mercounts directory entirely. Better yet, remove the whole output directory and start fresh.
Running with 8 GB of physical memory is, unfortunately, entirely untested.
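A sketch of that clean restart, using the directory names from this run:

```shell
# The generated job scripts cached the old memory=8 setting; remove them so
# they are regenerated with the new limit, then re-run with maxMemory=7.
rm -rf ecoli-oxford/correction/0-mercounts   # or: rm -rf ecoli-oxford
./canu-1.8/*/bin/canu -p ecoli -d ecoli-oxford genomeSize=4.8m \
    -nanopore-raw oxford.fasta useGrid=false maxMemory=7
```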
I'm running Canu v1.6 on Linux through the ont-assembly-polish pipeline (https://github.com/nanoporetech/ont-assembly-polish), on a remote server. The only Canu parameter I changed was useGrid=false. My genome size is around 7m. I get the following output, and I can't figure out what the issue is.