marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Canu possibly freezing, disconnected from server. #1173

Closed by skbrimer 5 years ago

skbrimer commented 5 years ago

Hello Canu Team,

I am having an issue at the overlap step. I am trying to assemble an E. coli genome from MinION reads. The command and the error are below.

canu -p 551_17_selected -d 551_17_selected genomeSize=5m -nanopore-raw selectReads.fq

packet_write_wait: Connection to 18.220.13.112 port 22: Broken pipe

I am running Canu on an EC2 instance (c4.8xlarge, 36 CPUs, 60 GB RAM), which I think is enough for this task.

Here is the self-configuration output from the start of the program.

-- Detected 36 CPUs and 59 gigabytes of memory.
-- No grid engine detected, grid disabled.
--
--                            (tag)Concurrency
--                     (tag)Threads          |
--            (tag)Memory         |          |
--        (tag)         |         |          |     total usage     algorithm
--        -------  ------  --------   --------  -----------------  -----------------------------
-- Local: meryl      6 GB    4 CPUs x   9 jobs    54 GB   36 CPUs  (k-mer counting)
-- Local: hap        6 GB    4 CPUs x   9 jobs    54 GB   36 CPUs  (read-to-haplotype assignment)
-- Local: cormhap    6 GB   12 CPUs x   3 jobs    18 GB   36 CPUs  (overlap detection with mhap)
-- Local: obtovl     4 GB    6 CPUs x   6 jobs    24 GB   36 CPUs  (overlap detection)
-- Local: utgovl     4 GB    6 CPUs x   6 jobs    24 GB   36 CPUs  (overlap detection)
-- Local: cor      --- GB    4 CPUs x   1 job    --- GB    4 CPUs  (read correction)
-- Local: ovb        4 GB    1 CPU  x  14 jobs    56 GB   14 CPUs  (overlap store bucketizer)
-- Local: ovs        8 GB    1 CPU  x   7 jobs    56 GB    7 CPUs  (overlap store sorting)
-- Local: red        6 GB    4 CPUs x   9 jobs    54 GB   36 CPUs  (read error detection)
-- Local: oea        4 GB    1 CPU  x  14 jobs    56 GB   14 CPUs  (overlap error adjustment)
-- Local: bat       16 GB    4 CPUs x   1 job     16 GB    4 CPUs  (contig construction with bogart)
-- Local: cns      --- GB    4 CPUs x   1 job    --- GB    4 CPUs  (consensus)
-- Local: gfa        8 GB    4 CPUs x   1 job      8 GB    4 CPUs  (GFA alignment and processing)
--
-- Found Nanopore uncorrected reads in the input files.
--
-- Generating assembly '551_17_selected' in '/home/ubuntu/ecoli_551_17/551_17_selected'
--
-- Parameters:
--
--  genomeSize        5000000
--
--  Overlap Generation Limits:
--    corOvlErrorRate 0.3200 ( 32.00%)
--    obtOvlErrorRate 0.1200 ( 12.00%)
--    utgOvlErrorRate 0.1200 ( 12.00%)
--
--  Overlap Processing Limits:
--    corErrorRate    0.5000 ( 50.00%)
--    obtErrorRate    0.1200 ( 12.00%)
--    utgErrorRate    0.1200 ( 12.00%)
--    cnsErrorRate    0.2000 ( 20.00%)
--
--

As far as I can tell, Canu is working fine and the instance is working correctly, but the run keeps getting hung up on the overlap step.

Here is the last output before it loses connection and terminates.

-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'obtovl' concurrent execution on Wed Dec  5 00:29:47 2018 with 213.249 GB free disk space (22 processes; 6 concurrently)

    cd trimming/1-overlapper
    ./overlap.sh 1 > ./overlap.000001.out 2>&1
    ./overlap.sh 2 > ./overlap.000002.out 2>&1
    ./overlap.sh 3 > ./overlap.000003.out 2>&1
    ./overlap.sh 4 > ./overlap.000004.out 2>&1
    ./overlap.sh 5 > ./overlap.000005.out 2>&1
    ./overlap.sh 6 > ./overlap.000006.out 2>&1
    ./overlap.sh 7 > ./overlap.000007.out 2>&1
    ./overlap.sh 8 > ./overlap.000008.out 2>&1
    ./overlap.sh 9 > ./overlap.000009.out 2>&1
    ./overlap.sh 10 > ./overlap.000010.out 2>&1
    ./overlap.sh 11 > ./overlap.000011.out 2>&1
    ./overlap.sh 12 > ./overlap.000012.out 2>&1
    ./overlap.sh 13 > ./overlap.000013.out 2>&1
    ./overlap.sh 14 > ./overlap.000014.out 2>&1
    ./overlap.sh 15 > ./overlap.000015.out 2>&1
    ./overlap.sh 16 > ./overlap.000016.out 2>&1
    ./overlap.sh 17 > ./overlap.000017.out 2>&1
    ./overlap.sh 18 > ./overlap.000018.out 2>&1
    ./overlap.sh 19 > ./overlap.000019.out 2>&1
    ./overlap.sh 20 > ./overlap.000020.out 2>&1
    ./overlap.sh 21 > ./overlap.000021.out 2>&1
    ./overlap.sh 22 > ./overlap.000022.out 2>&1
packet_write_wait: Connection to 18.220.13.112 port 22: Broken pipe

As you can see it just stops.

Do you have any ideas on what is going on?

Thank you, Sean

skoren commented 5 years ago

This doesn't look like a Canu error; your connection to the AWS instance is being terminated. You probably have a timeout set on your SSH connection. You could run Canu inside screen so that you can reconnect to your session, or change the SSH connection settings to avoid the timeouts.
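As a rough illustration of the second option, an entry like the one below in ~/.ssh/config on the local machine makes the OpenSSH client send periodic keepalive messages, which often prevents an idle connection from being dropped. This is only a sketch: the host alias canu-ec2 is made up, the address is the one from the error message above, and the user name ubuntu is a guess based on the /home/ubuntu path in the log.

```text
# ~/.ssh/config on the local machine (hypothetical entry)
Host canu-ec2
    # Address taken from the "Broken pipe" message above
    HostName 18.220.13.112
    # Assumed from the /home/ubuntu path in the Canu log
    User ubuntu
    # Send a keepalive to the server after 60 seconds without traffic
    ServerAliveInterval 60
    # Give up after 3 unanswered keepalives
    ServerAliveCountMax 3
```

For the screen option, the usual pattern is to start a named session with `screen -S canu`, launch Canu inside it, detach with Ctrl-a d, and reattach later with `screen -r canu`. A command started directly in the foreground of an SSH session is typically sent a hangup signal when that session drops, whereas a command running inside screen keeps going.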

Your canu run may still be active (assuming your instance is up and running) if you reconnect and check running processes.
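One way to check progress after reconnecting is to count the per-job log files from the overlap step, since each `./overlap.sh N > ./overlap.00000N.out` line in the log above creates its output file when that job launches. The sketch below assumes it is run from the directory that contains the 551_17_selected assembly directory; `count_started_jobs` is a hypothetical helper, not part of Canu.

```shell
#!/bin/sh
# Count how many overlap jobs have started, based on the per-job log files
# (overlap.000001.out, ...) that each ./overlap.sh invocation redirects into.
# count_started_jobs is a hypothetical helper, not part of Canu.
count_started_jobs() {
    dir="$1"
    # Each job's log is created by the shell redirect when the job launches.
    find "$dir" -maxdepth 1 -type f -name 'overlap.*.out' | wc -l | tr -d ' '
}

# Path assumed from the log above: <assembly dir>/trimming/1-overlapper
job_dir="${1:-551_17_selected/trimming/1-overlapper}"
if [ -d "$job_dir" ]; then
    echo "started: $(count_started_jobs "$job_dir") of 22 jobs"
else
    echo "directory not found: $job_dir"
fi
```

The count only shows how many jobs were launched, not how many finished. To see whether anything is still running, something like `ps aux | grep overlap` should list the active overlap processes, if there are any.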

skbrimer commented 5 years ago

I did check on the instance, and it is still up and running, but Canu had stopped. I took your advice and am using screen this time. Thank you for the quick response!