You should use -nanopore-raw and not specify -assemble. Porechop doesn't correct the data, it only trims it. You may also want to add -fast to save runtime.
The error is an incompatibility between the libraries on your computer and those Canu was built with. You should download the source and compile it yourself instead (see issue #821). The default OS X compiler doesn't support the parallelism that Canu uses, so you will also want to set threads to 1:
cnsThreads=1 corThreads=1 cormhapThreads=1 obtmhapThreads=1 oeaThreads=1 ovbThreads=1 ovsThreads=1 redThreads=1 utgmhapThreads=1
I'll look into how we can package better for OS X to avoid this.
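As a rough sketch, the thread options listed above can be generated from the stage prefixes rather than typed by hand. The variable names stages and thread_opts are made up for illustration; only the option names themselves come from the list above.

```shell
# Stage prefixes taken from the option list quoted above.
stages="cns cor cormhap obtmhap oea ovb ovs red utgmhap"

# Build "<prefix>Threads=1" for every stage, separated by spaces.
thread_opts=""
for s in $stages; do
  thread_opts="$thread_opts ${s}Threads=1"
done
thread_opts="${thread_opts# }"   # drop the leading space

# Print the options so they can be pasted onto the end of a canu command.
echo "$thread_opts"
```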
I'm sorry to ask what I'm sure is a stupid question, but what is the -fast option and where do I add it to my script?
My script is:
canu \
-p [Name of output file] -d [directory to save to] genomeSize=2.8m \
-nanopore-raw [directory to porechop demultiplexed and trimmed file]
An easier install for macOS would be great! I spent weeks trying to figure out why it wasn't working and had to go to a walk-in workshop in college to have them install it this morning.
You would add -fast to the canu command, so it becomes:
canu \
-fast -p [Name of output file] -d [directory to save to] \
genomeSize=2.8m \
-nanopore-raw [directory to porechop demultiplexed and trimmed file]
It's our experimental option that saves significant compute but may produce a less contiguous assembly. It works pretty well on bacterial genomes though.
Can you try downloading the following tarball: https://gembox.cbcb.umd.edu/shared/canu-1.7.Darwin-amd64.tar.bz2. You can extract it using tar xvjf canu-1.7.Darwin-amd64.tar.bz2. Can you then see whether it runs? You don't need to run the full pipeline, just run canu-1.7/Darwin-amd64/bin/gatekeeperCreate and post the output.
I downloaded and typed that command into terminal and got the following error
Last login: Fri Apr 6 14:14:50 on ttys001
MinIONs-iMac:~ minion$ xvjf canu-1.7.Darwin-amd64.tar.bz2
-bash: xvjf: command not found
MinIONs-iMac:~ minion$
It should be tar xvjf canu-1.7.Darwin-amd64.tar.bz2
MinIONs-iMac:~ minion$ tar xvjf canu-1.7.Darwin-amd64.tar.bz2
tar: Error opening archive: Failed to open 'canu-1.7.Darwin-amd64.tar.bz2'
MinIONs-iMac:~ minion$
If I double-click on it in my Downloads folder, it unzips.
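The "Failed to open" message above is typically what tar prints when the archive is not in the directory the command is run from. A hedged sketch that checks for the archive before extracting; the function name extract_if_present is hypothetical, and ~/Downloads is only a guess at where the browser saved the file.

```shell
# Extract a .tar.bz2 archive into the current directory, but only if it exists.
extract_if_present() {
  archive="$1"
  if [ -f "$archive" ]; then
    tar xjf "$archive"
  else
    echo "not found: $archive (current directory is $(pwd))" >&2
    return 1
  fi
}

# A new Terminal window starts in the home directory, so give the download
# location explicitly (assumed here to be ~/Downloads).
extract_if_present "$HOME/Downloads/canu-1.7.Darwin-amd64.tar.bz2" || true
```

If the archive is found, the canu-1.7 folder is created in whatever directory the command was run from.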
OK, what happens if you try to run the canu-1.7/Darwin-amd64/bin/gatekeeperCreate command?
MinIONs-iMac:~ minion$ canu-1.7/Darwin-amd64/bin/gatekeeperCreate
-bash: canu-1.7/Darwin-amd64/bin/gatekeeperCreate: No such file or directory
MinIONs-iMac:~ minion$
But I'm guessing I should move the unzipped folder to the directory holding the canu-1.7 folder that one of the lads in the high computer centre originally installed for me? He installed it and placed it in a directory for me so that when I launch Terminal I can just type canu for it to work, like I do with the BWA commands.
Either that or you can give the full path to wherever you downloaded/unzipped the tar.
I moved the Darwin download you sent me to the location the guy installed it for me this morning and ran the code again but it failed
MinIONs-iMac:~ minion$ canu-1.7/Darwin-amd64/bin/gatekeeperCreate
-bash: canu-1.7/Darwin-amd64/bin/gatekeeperCreate: No such file or directory
MinIONs-iMac:~ minion$
If you moved it to the same folder and replaced the previous one it should be /Users/minion/Documents/Apps/canu-1.7/Darwin-amd64/bin/gatekeeperCreate
I'm so sorry, I'm not terminal savvy! I've redone that and got this:
MinIONs-iMac:~ minion$ /Users/minion/Documents/Apps/canu-1.7/Darwin-amd64/bin/gatekeeperCreate
dyld: lazy symbol binding failed: Symbol not found: ___emutls_get_address
  Referenced from: /Users/minion/Documents/Apps/canu-1.7/Darwin-amd64/bin/..//lib/libgomp.1.dylib
  Expected in: /usr/lib/libSystem.B.dylib
dyld: Symbol not found: ___emutls_get_address
  Referenced from: /Users/minion/Documents/Apps/canu-1.7/Darwin-amd64/bin/..//lib/libgomp.1.dylib
  Expected in: /usr/lib/libSystem.B.dylib
Abort trap: 6
MinIONs-iMac:~ minion$
Ah OK, guess it won't be that easy, you'll have to build from source. What is your version of OS X?
High Sierra 10.13.3
although I've been prompted with an update to 10.13.4 just now!
One last thing to try before defaulting to building from source:
xcode-select --install
and confirm you have the file /usr/lib/libSystem.B.dylib. If it still doesn't work after that, download the source code rather than the OS X binary and follow the instructions in the release notes to compile.
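One way to confirm the file from Terminal, as a minimal sketch. The function name have_file is made up for illustration; the path is the one given above.

```shell
# Report whether a path exists; -e is true for files, symlinks and directories.
have_file() {
  if [ -e "$1" ]; then
    echo "present: $1"
  else
    echo "missing: $1"
  fi
}

have_file /usr/lib/libSystem.B.dylib
```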
MinIONs-iMac:~ minion$ xcode-select --install
xcode-select: error: command line tools are already installed, use "Software Update" to install updates
MinIONs-iMac:~ minion$
Xcode isn't installed on the Mac though, so I am downloading it via the App Store.
OK, I've now got Xcode installed via the Mac App Store.
I don't have the above file:
MinIONs-iMac:~ minion$ -a Finder /usr/lib/libSystem.B.dylib
-bash: -a: command not found
MinIONs-iMac:~ minion$
OK, I think I found the cause of the issue. Try downloading the same link as before again (https://gembox.cbcb.umd.edu/shared/canu-1.7.Darwin-amd64.tar.bz2), extract it, move it to the same location, and try running /Users/minion/Documents/Apps/canu-1.7/Darwin-amd64/bin/gatekeeperCreate and see what happens now.
I updated the latest release binaries and they should include the required libraries to run as long as you have OS X 10.12 or newer. I've confirmed it works on a machine without OpenMP/GCC installed as well.
Sorry I had left the office when you replied with the above. I will try this this afternoon! Cheers
Hi Skoren, I have downloaded that file, replaced the old one and ran the above command. This is the result I got. Does this look right to you?
Last login: Fri Apr 6 16:03:56 on ttys000
MinIONs-iMac:~ minion$ /Users/minion/Documents/Apps/canu-1.7/Darwin-amd64/bin/gatekeeperCreate
usage: /Users/minion/Documents/Apps/canu-1.7/Darwin-amd64/bin/gatekeeperCreate [-minlength L] -o gkpStore input.gkp
  -o gkpStore      load raw reads into new gkpStore
  -minlength L     discard reads shorter than L
ERROR: no gkpStore (-o) supplied.
ERROR: no input files supplied.
MinIONs-iMac:~ minion$
OK so it works on your system now, you should be able to run the assembly you were running before.
Cheers. I just tried it there but there seems to be an issue with the genomeSize command now
MinIONs-iMac:~ minion$ canu \
-p 300OR1 -d /Users/minion/Desktop/AoC\ MinION\ Seq/Canu\ Output \
genomeSize=2.8m \
-nanopore-raw /Users/minion/Desktop/AoC\ MinION\ Seq/Porechop\ Files/BC01.fastq
usage: canu [-version] [-citation] \
[-correct | -trim | -assemble | -trim-assemble] \
[-s <assembly-specifications-file>] \
-p <assembly-prefix> \
-d <assembly-directory> \
genomeSize=<number>[g|m|k] \
[other-options] \
[-pacbio-raw |
-pacbio-corrected |
-nanopore-raw |
-nanopore-corrected] file1 file2 ...
example: canu -d run1 -p godzilla genomeSize=1g -nanopore-raw reads/*.fasta.gz
To restrict canu to only a specific stage, use:
-correct - generate corrected reads
-trim - generate trimmed reads
-assemble - generate an assembly
-trim-assemble - generate trimmed reads and then assemble them
The assembly is computed in the -d <assembly-directory>, with output files named
using the -p <assembly-prefix>. This directory is created if needed. It is not
possible to run multiple assemblies in the same directory.
The genome size should be your best guess of the haploid genome size of what is being
assembled. It is used primarily to estimate coverage in reads, NOT as the desired
assembly size. Fractional values are allowed: '4.7m' equals '4700k' equals '4700000'
Some common options:
useGrid=string
- Run under grid control (true), locally (false), or set up for grid control
but don't submit any jobs (remote)
rawErrorRate=fraction-error
- The allowed difference in an overlap between two raw uncorrected reads. For lower
quality reads, use a higher number. The defaults are 0.300 for PacBio reads and
0.500 for Nanopore reads.
correctedErrorRate=fraction-error
- The allowed difference in an overlap between two corrected reads. Assemblies of
low coverage or data with biological differences will benefit from a slight increase
in this. Defaults are 0.045 for PacBio reads and 0.144 for Nanopore reads.
gridOptions=string
- Pass string to the command used to submit jobs to the grid. Can be used to set
maximum run time limits. Should NOT be used to set memory limits; Canu will do
that for you.
minReadLength=number
- Ignore reads shorter than 'number' bases long. Default: 1000.
minOverlapLength=number
- Ignore read-to-read overlaps shorter than 'number' bases long. Default: 500.
A full list of options can be printed with '-options'. All options can be supplied in
an optional sepc file with the -s option.
Reads can be either FASTA or FASTQ format, uncompressed, or compressed with gz, bz2 or xz.
Reads are specified by the technology they were generated with, and any processing performed:
-pacbio-raw <files> Reads are straight off the machine.
-pacbio-corrected <files> Reads have been corrected.
-nanopore-raw <files>
-nanopore-corrected <files>
Complete documentation at http://canu.readthedocs.org/en/latest/
ERROR: File 'genomeSize=2.8m' supplied on command line, don't know what to do with it.
Don't use spaces in the filenames, I'm not sure those will get properly escaped.
Unfortunately it's the same error
MinIONs-iMac:~ minion$ canu \
-p 300OR1 -d /Users/minion/Desktop \
genomeSize=2.8m \
-nanopore-raw /Users/minion/Desktop/BC01.fastq
usage: canu [-version] [-citation] \
[-correct | -trim | -assemble | -trim-assemble] \
[-s <assembly-specifications-file>] \
-p <assembly-prefix> \
-d <assembly-directory> \
genomeSize=<number>[g|m|k] \
[other-options] \
[-pacbio-raw |
-pacbio-corrected |
-nanopore-raw |
-nanopore-corrected] file1 file2 ...
example: canu -d run1 -p godzilla genomeSize=1g -nanopore-raw reads/*.fasta.gz
To restrict canu to only a specific stage, use:
-correct - generate corrected reads
-trim - generate trimmed reads
-assemble - generate an assembly
-trim-assemble - generate trimmed reads and then assemble them
The assembly is computed in the -d <assembly-directory>, with output files named
using the -p <assembly-prefix>. This directory is created if needed. It is not
possible to run multiple assemblies in the same directory.
The genome size should be your best guess of the haploid genome size of what is being
assembled. It is used primarily to estimate coverage in reads, NOT as the desired
assembly size. Fractional values are allowed: '4.7m' equals '4700k' equals '4700000'
Some common options:
useGrid=string
- Run under grid control (true), locally (false), or set up for grid control
but don't submit any jobs (remote)
rawErrorRate=fraction-error
- The allowed difference in an overlap between two raw uncorrected reads. For lower
quality reads, use a higher number. The defaults are 0.300 for PacBio reads and
0.500 for Nanopore reads.
correctedErrorRate=fraction-error
- The allowed difference in an overlap between two corrected reads. Assemblies of
low coverage or data with biological differences will benefit from a slight increase
in this. Defaults are 0.045 for PacBio reads and 0.144 for Nanopore reads.
gridOptions=string
- Pass string to the command used to submit jobs to the grid. Can be used to set
maximum run time limits. Should NOT be used to set memory limits; Canu will do
that for you.
minReadLength=number
- Ignore reads shorter than 'number' bases long. Default: 1000.
minOverlapLength=number
- Ignore read-to-read overlaps shorter than 'number' bases long. Default: 500.
A full list of options can be printed with '-options'. All options can be supplied in
an optional sepc file with the -s option.
Reads can be either FASTA or FASTQ format, uncompressed, or compressed with gz, bz2 or xz.
Reads are specified by the technology they were generated with, and any processing performed:
-pacbio-raw <files> Reads are straight off the machine.
-pacbio-corrected <files> Reads have been corrected.
-nanopore-raw <files>
-nanopore-corrected <files>
Complete documentation at http://canu.readthedocs.org/en/latest/
ERROR: File 'genomeSize=2.8m' supplied on command line, don't know what to do with it.
Is there a file 'genomeSize=2.8m' in the directory where you ran Canu? That's the only way that message can be printed.
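A quick way to look for such a file, as a hedged sketch. The function name find_stray_option_files is hypothetical; find is used because its name pattern also matches hidden (dot-prefixed) entries, which a plain ls would not show.

```shell
# List entries directly inside a directory whose name contains "genomeSize".
# With no argument, the current directory is searched.
find_stray_option_files() {
  find "${1:-.}" -maxdepth 1 -name '*genomeSize*' -print
}

# Run this from the directory where the canu command was launched.
find_stray_option_files .
```

No output means nothing matching was found in that directory; any path printed is a candidate to move or delete before re-running canu.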
There's not. But I will make a new directory and try now.
Same error. I'm just wondering, do I have to put a " - " before genomeSize? The rest of the script has it except that line.
Nope, you don't need the -. I'd guess there is a bad character somewhere in the command line. Are you copying/pasting it? Try re-typing it; using local paths is fine:
canu -p 300OR1 -d asm genomeSize=2.8m -nanopore-raw /Users/minion/Desktop/BC01.fastq
I was typing with the local paths. But I will give it another shot!
Same again unfortunately. Just curious as to what the asm refers to in your script after -d above.
MinIONs-iMac:~ minion$ canu -p 300OR1 -d /Users/minion/Desktop genomeSize=2.8m -nanopore-raw /Users/minion/Desktop/BC01.fastq
I've confirmed the syntax is correct and I can run your command locally. I don't think this is a Canu error but a command-line issue. In fact, nothing in the canu script has changed from the first version you downloaded and ran initially so you should be able to re-run that command unless something has changed on your terminal. If it still doesn't work, I would suggest making sure you can run the tutorial assemblies on the quick start page first.
Is that just this command?
MinIONs-iMac:~ minion$ canu
usage: canu [-version] [-citation] \
[-correct | -trim | -assemble | -trim-assemble] \
[-s <assembly-specifications-file>] \
-p <assembly-prefix> \
-d <assembly-directory> \
genomeSize=<number>[g|m|k] \
[other-options] \
[-pacbio-raw |
-pacbio-corrected |
-nanopore-raw |
-nanopore-corrected] file1 file2 ...
example: canu -d run1 -p godzilla genomeSize=1g -nanopore-raw reads/*.fasta.gz
To restrict canu to only a specific stage, use:
-correct - generate corrected reads
-trim - generate trimmed reads
-assemble - generate an assembly
-trim-assemble - generate trimmed reads and then assemble them
The assembly is computed in the -d <assembly-directory>, with output files named
using the -p <assembly-prefix>. This directory is created if needed. It is not
possible to run multiple assemblies in the same directory.
The genome size should be your best guess of the haploid genome size of what is being
assembled. It is used primarily to estimate coverage in reads, NOT as the desired
assembly size. Fractional values are allowed: '4.7m' equals '4700k' equals '4700000'
Some common options:
useGrid=string
- Run under grid control (true), locally (false), or set up for grid control
but don't submit any jobs (remote)
rawErrorRate=fraction-error
- The allowed difference in an overlap between two raw uncorrected reads. For lower
quality reads, use a higher number. The defaults are 0.300 for PacBio reads and
0.500 for Nanopore reads.
correctedErrorRate=fraction-error
- The allowed difference in an overlap between two corrected reads. Assemblies of
low coverage or data with biological differences will benefit from a slight increase
in this. Defaults are 0.045 for PacBio reads and 0.144 for Nanopore reads.
gridOptions=string
- Pass string to the command used to submit jobs to the grid. Can be used to set
maximum run time limits. Should NOT be used to set memory limits; Canu will do
that for you.
minReadLength=number
- Ignore reads shorter than 'number' bases long. Default: 1000.
minOverlapLength=number
- Ignore read-to-read overlaps shorter than 'number' bases long. Default: 500.
A full list of options can be printed with '-options'. All options can be supplied in
an optional sepc file with the -s option.
Reads can be either FASTA or FASTQ format, uncompressed, or compressed with gz, bz2 or xz.
Reads are specified by the technology they were generated with, and any processing performed:
-pacbio-raw <files> Reads are straight off the machine.
-pacbio-corrected <files> Reads have been corrected.
-nanopore-raw <files>
-nanopore-corrected <files>
Complete documentation at http://canu.readthedocs.org/en/latest/
MinIONs-iMac:~ minion$
I downloaded the Oxford data from the link on the quick start page and ran the command. It seems to be working fine. It hasn't finished yet, so I can't copy the full terminal window in here, but there are no errors so far.
I'm not sure what you're asking, that is the help of the command when you don't specify any options. You are getting it reported because the options specified are invalid/not parsed correctly.
If the quick start is running that means Canu is working correctly. You can stop that run and remove the ecoli-oxford folder. Try the same command you used for quickstart just update the genome size and read location and see if that runs.
Sorry I thought that was what you wanted me to run but I ran the ones from the quick start window and it seems to be working.
Last login: Mon Apr 9 16:08:43 on ttys000
MinIONs-iMac:~ minion$ curl -L -o oxford.fasta http://nanopore.s3.climb.ac.uk/MAP006-PCR-1_2D_pass.fasta
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 138M 100 138M 0 0 1683k 0 0:01:24 0:01:24 --:--:-- 2627k
MinIONs-iMac:~ minion$ canu -p ecoli -d ecoli-oxford genomeSize=4.8m -nanopore-raw oxford.fasta
-- Canu snapshot v1.7 +0 changes (r8692 c9ef9219a265e0bbe3a311cca7d28aa02b7517d3)
--
-- CITATIONS
--
-- Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
-- Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
-- Genome Res. 2017 May;27(5):722-736.
-- http://doi.org/10.1101/gr.215087.116
--
-- Read and contig alignments during correction, consensus and GFA building use:
-- Šošic M, Šikic M.
-- Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
-- Bioinformatics. 2017 May 1;33(9):1394-1395.
-- http://doi.org/10.1093/bioinformatics/btw753
--
-- Overlaps are generated using:
-- Berlin K, et al.
-- Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
-- Nat Biotechnol. 2015 Jun;33(6):623-30.
-- http://doi.org/10.1038/nbt.3238
--
-- Myers EW, et al.
-- A Whole-Genome Assembly of Drosophila.
-- Science. 2000 Mar 24;287(5461):2196-204.
-- http://doi.org/10.1126/science.287.5461.2196
--
-- Li H.
-- Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences.
-- Bioinformatics. 2016 Jul 15;32(14):2103-10.
-- http://doi.org/10.1093/bioinformatics/btw152
--
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
-- Chin CS, et al.
-- Phased diploid genome assembly with single-molecule real-time sequencing.
-- Nat Methods. 2016 Dec;13(12):1050-1054.
-- http://doi.org/10.1038/nmeth.4035
--
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
-- Chin CS, et al.
-- Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
-- Nat Methods. 2013 Jun;10(6):563-9
-- http://doi.org/10.1038/nmeth.2474
--
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '9.0.4' (from 'java').
-- Detected gnuplot version '5.2 patchlevel 2' (from 'gnuplot') and image format 'png'.
-- Detected 8 CPUs and 32 gigabytes of memory.
-- No grid engine detected, grid disabled.
--
-- (tag)Concurrency
-- (tag)Threads |
-- (tag)Memory | |
-- (tag) | | | total usage algorithm
-- ------- ------ -------- -------- ----------------- -----------------------------
-- Local: meryl 8 GB 4 CPUs x 1 job 8 GB 4 CPUs (k-mer counting)
-- Local: cormhap 6 GB 8 CPUs x 1 job 6 GB 8 CPUs (overlap detection with mhap)
-- Local: obtovl 4 GB 8 CPUs x 1 job 4 GB 8 CPUs (overlap detection)
-- Local: utgovl 4 GB 8 CPUs x 1 job 4 GB 8 CPUs (overlap detection)
-- Local: ovb 4 GB 1 CPU x 8 jobs 32 GB 8 CPUs (overlap store bucketizer)
-- Local: ovs 8 GB 1 CPU x 4 jobs 32 GB 4 CPUs (overlap store sorting)
-- Local: red 4 GB 4 CPUs x 2 jobs 8 GB 8 CPUs (read error detection)
-- Local: oea 4 GB 1 CPU x 8 jobs 32 GB 8 CPUs (overlap error adjustment)
-- Local: bat 16 GB 4 CPUs x 1 job 16 GB 4 CPUs (contig construction)
-- Local: gfa 8 GB 4 CPUs x 1 job 8 GB 4 CPUs (GFA alignment and processing)
--
-- Found Nanopore uncorrected reads in the input files.
--
-- Generating assembly 'ecoli' in '/Users/minion/ecoli-oxford'
--
-- Parameters:
--
-- genomeSize 4800000
--
-- Overlap Generation Limits:
-- corOvlErrorRate 0.3200 ( 32.00%)
-- obtOvlErrorRate 0.1440 ( 14.40%)
-- utgOvlErrorRate 0.1440 ( 14.40%)
--
-- Overlap Processing Limits:
-- corErrorRate 0.5000 ( 50.00%)
-- obtErrorRate 0.1440 ( 14.40%)
-- utgErrorRate 0.1440 ( 14.40%)
-- cnsErrorRate 0.1920 ( 19.20%)
--
--
-- BEGIN CORRECTION
--
----------------------------------------
-- Starting command on Mon Apr 9 16:11:25 2018 with 646.795 GB free disk space
cd .
/Users/minion/Documents/Apps/canu-1.7/Darwin-amd64/bin/gatekeeperCreate \
-minlength 1000 \
-o ./ecoli.gkpStore.BUILDING \
./ecoli.gkpStore.gkp \
> ./ecoli.gkpStore.BUILDING.err 2>&1
-- Finished on Mon Apr 9 16:11:27 2018 (2 seconds) with 646.75 GB free disk space
----------------------------------------
--
-- In gatekeeper store './ecoli.gkpStore':
-- Found 20365 reads.
-- Found 140042151 bases (29.17 times coverage).
--
-- Read length histogram (one '*' equals 41.48 reads):
-- 0 999 0
-- 1000 1999 706 *****************
-- 2000 2999 1682 ****************************************
-- 3000 3999 1624 ***************************************
-- 4000 4999 1543 *************************************
-- 5000 5999 1905 *********************************************
-- 6000 6999 2691 ****************************************************************
-- 7000 7999 2904 **********************************************************************
-- 8000 8999 2609 **************************************************************
-- 9000 9999 1946 **********************************************
-- 10000 10999 1280 ******************************
-- 11000 11999 733 *****************
-- 12000 12999 397 *********
-- 13000 13999 181 ****
-- 14000 14999 109 **
-- 15000 15999 38
-- 16000 16999 9
-- 17000 17999 4
-- 18000 18999 2
-- 19000 19999 0
-- 20000 20999 0
-- 21000 21999 0
-- 22000 22999 1
-- 23000 23999 0
-- 24000 24999 0
-- 25000 25999 1
--
-- Running jobs. First attempt out of 2.
----------------------------------------
-- Starting 'meryl' concurrent execution on Mon Apr 9 16:11:27 2018 with 646.75 GB free disk space (1 processes; 1 concurrently)
cd correction/0-mercounts
./meryl.sh 1 > ./meryl.000001.out 2>&1
-- Finished on Mon Apr 9 16:11:41 2018 (14 seconds) with 646.419 GB free disk space
----------------------------------------
-- Meryl finished successfully.
--
-- 16-mers Fraction
-- Occurrences NumMers Unique Total
-- 1- 1 70555151 *******************************************************************--> 0.8655 0.5049
-- 2- 2 4917952 ********************************************************************** 0.9259 0.5753
-- 3- 4 1399886 ******************* 0.9380 0.5965
-- 5- 7 905449 ************ 0.9466 0.6188
-- 8- 11 1612461 ********************** 0.9586 0.6683
-- 12- 16 1552473 ********************** 0.9789 0.7922
-- 17- 22 494738 ******* 0.9949 0.9293
-- 23- 29 52236 0.9993 0.9785
-- 30- 37 9061 0.9997 0.9851
-- 38- 46 4073 0.9998 0.9870
-- 47- 56 2676 0.9998 0.9881
-- 57- 67 1989 0.9999 0.9891
-- 68- 79 2326 0.9999 0.9900
-- 80- 92 2011 0.9999 0.9912
-- 93- 106 1225 1.0000 0.9924
-- 107- 121 636 1.0000 0.9933
-- 122- 137 517 1.0000 0.9938
-- 138- 154 349 1.0000 0.9942
-- 155- 172 166 1.0000 0.9946
-- 173- 191 107 1.0000 0.9948
-- 192- 211 80 1.0000 0.9949
-- 212- 232 60 1.0000 0.9950
-- 233- 254 53 1.0000 0.9951
-- 255- 277 37 1.0000 0.9952
-- 278- 301 33 1.0000 0.9953
-- 302- 326 29 1.0000 0.9954
-- 327- 352 21 1.0000 0.9954
-- 353- 379 27 1.0000 0.9955
-- 380- 407 17 1.0000 0.9955
-- 408- 436 19 1.0000 0.9956
-- 437- 466 14 1.0000 0.9956
-- 467- 497 13 1.0000 0.9957
-- 498- 529 17 1.0000 0.9957
-- 530- 562 20 1.0000 0.9958
-- 563- 596 10 1.0000 0.9959
-- 597- 631 16 1.0000 0.9959
-- 632- 667 10 1.0000 0.9960
-- 668- 704 9 1.0000 0.9960
-- 705- 742 8 1.0000 0.9961
-- 743- 781 11 1.0000 0.9961
-- 782- 821 6 1.0000 0.9962
--
-- 13740 (max occurrences)
-- 69181525 (total mers, non-unique)
-- 10960962 (distinct mers, non-unique)
-- 70555151 (unique mers)
-- For mhap overlapping, set repeat k-mer threshold to 1397.
--
-- Found 139736676 16-mers; 81516113 distinct and 70555151 unique. Largest count 13740.
--
-- OVERLAPPER (mhap) (correction)
--
-- Set corMhapSensitivity=high based on read coverage of 29.
--
-- PARAMETERS: hashes=768, minMatches=2, threshold=0.78
--
-- Given 6 GB, can fit 9000 reads per block.
-- For 4 blocks, set stride to 2 blocks.
-- Logging partitioning to 'correction/1-overlapper/partitioning.log'.
-- Configured 3 mhap precompute jobs.
-- Configured 3 mhap overlap jobs.
--
-- Running jobs. First attempt out of 2.
----------------------------------------
-- Starting 'cormhap' concurrent execution on Mon Apr 9 16:11:42 2018 with 646.75 GB free disk space (3 processes; 1 concurrently)
cd correction/1-overlapper
./precompute.sh 1 > ./precompute.000001.out 2>&1
I got it working!! Thank you so much for all your help! There was a hidden genomeSize file that I scoured the computer for and found! Once I deleted that and moved the folder to the home directory it worked! Again, thanks so much!
No problem. Since you're running on a small-ish machine you may want to add -fast to the command too; it will save some compute and should give a comparable assembly for bacterial genomes.
I'll do that going forward because we are using Canu to assemble bacterial genomes. Would it work with yeast genomes too? Candida albicans?
It will work with any genome, it just might give you a less contiguous assembly.
Cool. Cheers for all your help! I really appreciate it!
Hi Skoren, sorry to bother you again. I am trying to run Canu on a student's nanopore reads that have been trimmed and demultiplexed with Porechop. However, I am getting a new error with Canu, which I have pasted below. I must point out that the reads are not as long as usual.
Gatekeeper detected potential problems in your input reads.
Please review the logging in files:
/Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.BUILDING.err
/Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.BUILDING/errorLog
If you wish to proceed, rename the store with the following command and restart canu.
mv /Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.BUILDING \
/Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.ACCEPTED
If I still want to run it, I am uncertain what it means by rename the "store". What is the store? Do I write it as follows:
canu -fast -p [Name] -d [Output directory] genomeSize=3.6m -nanopore-raw [Input File] mv /Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.BUILDING \ /Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.ACCEPTED
Essentially, it's warning you that too many of your reads were filtered out, most likely due to length. You can see the full log in /Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.BUILDING/errorLog.
If you're OK with the filtering you can follow the Canu instructions: run the command it gives you as is,
mv /Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.BUILDING /Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.ACCEPTED
and re-launch the original Canu command as before.
If you always want it to ignore that reads were filtered and assemble anyway, add stopOnReadQuality=false to your Canu command line (or put it in a file named canu.defaults in /Users/minion/Documents/Apps/canu-1.7/Darwin-amd64/bin/).
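As a sketch of the canu.defaults route: the snippet below appends the option to a canu.defaults file inside a bin directory. The directory here is a temporary stand-in created only for illustration; on the machine in this thread it would be the bin directory quoted above.

```shell
# Stand-in for the Canu bin directory (an assumption for this example).
# Replace with the real location, e.g. the Darwin-amd64/bin path quoted above.
canu_bin=$(mktemp -d)
defaults="$canu_bin/canu.defaults"

# Append the option only if it is not already set, so re-running is harmless.
if ! grep -qs '^stopOnReadQuality=' "$defaults"; then
  echo 'stopOnReadQuality=false' >> "$defaults"
fi

# Show the resulting file.
cat "$defaults"
```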
Ok cool. I will try that again. Thanks so much.
Hi Skoren, sorry to bother you again. I have only got around to looking at this now. So after the fail, I still want to run and typed the command as given:
mv /Users/minion/Desktop/Sarah 11_04_18/Canu Output/BC01.gkpStore.BUILDING \ /Users/minion/Desktop/Sarah 11_04_18/Canu Output/BC01.gkpStore.ACCEPTED
However I get this:
MinIONs-iMac:~ minion$ mv /Users/minion/Desktop/Sarah 11_04_18/Canu Output/BC01.gkpStore.BUILDING \
/Users/minion/Desktop/Sarah 11_04_18/Canu Output/BC01.gkpStore.ACCEPTED
usage: mv [-f | -i | -n] [-v] source target
       mv [-f | -i | -n] [-v] source ... directory
MinIONs-iMac:~ minion$
What do -f, -i, -n and -v stand for?
Cheers!
That's just the mv command usage. You can get information on most commands using man, for example man mv.
The problem is you truncated your command when you copied it. The \ means it continues onto the next line, but that part didn't get copied. You need the full command:
mv /Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.BUILDING /Users/minion/Sarah110418/CanuOutput/M160427.gkpStore.ACCEPTED
or just erase the Canu output folder (Canu Output) and restart with stopOnReadQuality=false.
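A further possibility, not confirmed in this thread: the paths in the mv attempt above contain spaces (Sarah 11_04_18, Canu Output), and an unquoted space splits one path into several arguments, which can also make mv print its usage. A minimal sketch of quoting such paths, using placeholder folders created under a temporary directory rather than the real data:

```shell
# Throwaway directory tree whose names contain spaces (placeholders only).
work=$(mktemp -d)
mkdir -p "$work/Canu Output/BC01.gkpStore.BUILDING"

# Double quotes keep each path as a single argument despite the spaces.
mv "$work/Canu Output/BC01.gkpStore.BUILDING" "$work/Canu Output/BC01.gkpStore.ACCEPTED"

# List the folder to see the renamed store.
ls "$work/Canu Output"
```

Escaping every space with a backslash, as in the earlier canu command in this thread, is the other common way to keep such a path in one piece.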
Hi All,
I'm new to running Canu and quite a novice at using terminal commands. I managed to get Canu installed on an iMac that I use for Nanopore sequencing. I am looking to run an assembly with Canu but it just doesn't seem to work! Any advice on what I am doing wrong? I have attached the copy of the script from terminal.
Am I right in assuming:
-p is the name I want for the output file? -d the location to put the new file?
I have trimmed and demultiplexed with Porechop, so I am also assuming the last option is nanopore-corrected and then the path to the file that I want to align?
Cheers