marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Overlap store sorting error in running Canu 1.9 in VirtualBox #1808

Status: Closed (eburchard closed 4 years ago)

eburchard commented 4 years ago

Hello all,

I am having an issue with Canu 1.9 running in VirtualBox on a Windows 10 machine. The reads are from a MinION sequencer, for a 30 Mb genome with very high coverage (over 600x). I am running this within a shared folder, which from what I have seen in previous posts can be an issue, but I don't really have much choice here: running it completely within the VM doesn't provide adequate space (I think the most you can allocate is 2 TB) and it just ends up freezing.

Here is the command I used initially:

canu -p clado_all -d clado_all genomeSize=30m -nanopore-raw *.fastq ovsMemory=100

And the resource allocations:

--                            (tag)Concurrency
--                     (tag)Threads          |
--            (tag)Memory         |          |
--        (tag)         |         |          |     total usage     algorithm
--        -------  ------  --------   --------  -----------------  -----------------------------
-- Local: meryl     12 GB    3 CPUs x   7 jobs    84 GB   21 CPUs  (k-mer counting)
-- Local: hap        8 GB    3 CPUs x   7 jobs    56 GB   21 CPUs  (read-to-haplotype assignment)
-- Local: cormhap    6 GB    7 CPUs x   3 jobs    18 GB   21 CPUs  (overlap detection with mhap)
-- Local: obtovl     4 GB    7 CPUs x   3 jobs    12 GB   21 CPUs  (overlap detection)
-- Local: utgovl     4 GB    7 CPUs x   3 jobs    12 GB   21 CPUs  (overlap detection)
-- Local: cor        8 GB    4 CPUs x   5 jobs    40 GB   20 CPUs  (read correction)
-- Local: ovb        4 GB    1 CPU  x  21 jobs    84 GB   21 CPUs  (overlap store bucketizer)
-- Local: ovs      100 GB    1 CPU  x   1 job    100 GB    1 CPU   (overlap store sorting)
-- Local: red       16 GB    3 CPUs x   7 jobs   112 GB   21 CPUs  (read error detection)
-- Local: oea        8 GB    1 CPU  x  16 jobs   128 GB   16 CPUs  (overlap error adjustment)
-- Local: bat       16 GB    4 CPUs x   1 job     16 GB    4 CPUs  (contig construction with bogart)
-- Local: cns      --- GB    4 CPUs x --- jobs   --- GB  --- CPUs  (consensus)
-- Local: gfa       16 GB    4 CPUs x   1 job     16 GB    4 CPUs  (GFA alignment and processing)

And this is the error I get at the end, during overlap store sorting:

-- Finished on Thu Oct  1 17:31:11 2020 (29 seconds) with 7144.568 GB free disk space
----------------------------------------
-- Overlap store bucketizer finished.
-- No change in report.
-- Finished stage 'cor-overlapStoreBucketizerCheck', reset canuIteration.
-- No change in report.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'ovS' concurrent execution on Thu Oct  1 17:31:11 2020 with 7144.568 GB free disk space (2 processes; 1 concurrently)

    cd correction/clado_all.ovlStore.BUILDING
    ./scripts/2-sort.sh 1 > ./logs/2-sort.000001.out 2>&1
    ./scripts/2-sort.sh 2 > ./logs/2-sort.000002.out 2>&1

-- Finished on Thu Oct  1 17:31:51 2020 (40 seconds) with 7144.568 GB free disk space
----------------------------------------
--
-- Overlap store sorting jobs failed, retry.
--   job correction/clado_all.ovlStore.BUILDING/0001 FAILED.
--   job correction/clado_all.ovlStore.BUILDING/0002 FAILED.
--
-- No change in report.
--
-- Running jobs.  Second attempt out of 2.
----------------------------------------
-- Starting 'ovS' concurrent execution on Thu Oct  1 17:31:51 2020 with 7144.568 GB free disk space (2 processes; 1 concurrently)

    cd correction/clado_all.ovlStore.BUILDING
    ./scripts/2-sort.sh 1 > ./logs/2-sort.000001.out 2>&1
    ./scripts/2-sort.sh 2 > ./logs/2-sort.000002.out 2>&1

-- Finished on Thu Oct  1 17:32:31 2020 (40 seconds) with 7144.568 GB free disk space
----------------------------------------
--
-- Overlap store sorting jobs failed, tried 2 times, giving up.
--   job correction/clado_all.ovlStore.BUILDING/0001 FAILED.
--   job correction/clado_all.ovlStore.BUILDING/0002 FAILED.

Can someone please help me? I apologize if I have posted inadequate information, but if more is needed please let me know and I will post it ASAP.

Thanks very much...

skoren commented 4 years ago

First, you don't really need 600x for assembly, you can randomly downsample to about 100-200x before starting the assembly. More recent versions of Canu than 1.9 do this automatically.
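A minimal sketch of the random downsampling mentioned above, for Canu versions that don't do it automatically. It assumes uncompressed FASTQ with exactly four lines per read and that `awk` is available; the function names, the seed, and the output file name are illustrative and not part of Canu. It keeps each read with a fixed probability, so the resulting coverage is only approximately the target.

```shell
#!/bin/sh
# Fraction of reads to keep to go from the current coverage to a target.
# Usage: target_fraction HAVE WANT
target_fraction() {
    awk -v have="$1" -v want="$2" 'BEGIN { printf "%.2f\n", want / have }'
}

# Randomly keep a fraction of reads from a 4-line-per-record FASTQ stream.
# Usage: downsample_fastq FRACTION SEED < in.fastq > out.fastq
downsample_fastq() {
    awk -v frac="$1" -v seed="$2" '
        BEGIN { srand(seed) }
        # Decide once per read, on its header line (1st of every 4 lines).
        NR % 4 == 1 { keep = (rand() < frac) }
        # Print all four lines of a read that was kept.
        keep
    '
}

# Example (clado_sub.fastq is a placeholder name):
#   frac=$(target_fraction 600 150)
#   cat *.fastq | downsample_fastq "$frac" 42 > clado_sub.fastq
```

Going from about 600x to about 150x means keeping roughly a quarter of the reads. Because reads are sampled by count rather than by length, it is worth checking the total bases in the output before starting the assembly.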

For the error, what is in the log for the sorting step (correction/clado_all.ovlStore.BUILDING/logs/*)?
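One way to pull the relevant part of those logs, sketched as a small helper. The `2-sort.*.out` pattern follows the file names shown in the Canu output above; the function name and the default of 5 lines are arbitrary choices for illustration.

```shell
#!/bin/sh
# Print the last N lines (default 5) of every sort log in a store directory.
# Usage: show_sort_logs STORE_DIR [N]
show_sort_logs() {
    for f in "$1"/logs/2-sort.*.out; do
        # Skip the unexpanded pattern when no log files exist.
        [ -e "$f" ] || continue
        echo "== $f"
        tail -n "${2:-5}" "$f"
    done
}

# Example:
#   show_sort_logs correction/clado_all.ovlStore.BUILDING
```

The actual failure message is usually near the end of each log, so the tail is normally enough to post.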

eburchard commented 4 years ago

> First, you don't really need 600x for assembly, you can randomly downsample to about 100-200x before starting the assembly. More recent versions of Canu than 1.9 do this automatically.

Agreed, but not really up to me lol

Here is 2-sort.000001.out

Found perl:
   /usr/bin/perl
   This is perl 5, version 30, subversion 0 (v5.30.0) built for x86_64-linux-gnu-thread-multi

Found java:
   /usr/bin/java
   openjdk version "11.0.8" 2020-07-14

Found canu:
   /usr/lib/canu/bin/canu
   Canu 1.9

Running job 1 based on command line options.

Attempting to increase maximum allowed processes and open files.
./scripts/2-sort.sh: 1: ulimit: Illegal option -u
./scripts/2-sort.sh: 1: ulimit: Illegal option -u
./scripts/2-sort.sh: 72: ulimit: Illegal option -u
./scripts/2-sort.sh: 1: ulimit: Illegal option -u
  Changed max processes per user from  to  (max ).
  Changed max open files from 1024 to 1048576 (max 1048576).

Finding overlaps.
  found   70131909 overlaps in './clado_all.ovlStore.BUILDING/bucket0001/sliceSizes'.
  found   69212954 overlaps in './clado_all.ovlStore.BUILDING/bucket0002/sliceSizes'.

Loading  139344863 overlaps using 3.11 GB of allowed (-M) 4 GB memory.
  loading   70131909 overlaps from './clado_all.ovlStore.BUILDING/bucket0001/slice0001'.
  loading   69212954 overlaps from './clado_all.ovlStore.BUILDING/bucket0002/slice0001'.

Sorting.

Writing sorted overlaps.
Failed to open './clado_all.ovlStore.BUILDING/0001<001>' for writing: Protocol error

Thanks for getting back to me so quickly!!!!

skoren commented 4 years ago

Unfortunately, as you suspected, it seems the error is due to the VM. It is also possible that the VM's shared folder doesn't like Canu's file naming. Try creating a file with the special characters in the shared folder via touch "test<>". If that fails as well, it's likely the file name issue. In that case, updating to Canu 2.1 should fix it, as it no longer uses the special characters.

If the touch succeeds, then the file name isn't the problem and it's a VM filesystem issue. In that case there isn't much you can do in Canu to fix it. The easiest options would be either to downsample or to move the assembly folder to the non-shared space (assuming it fits).
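The file name check can be sketched as a small helper that tries to create a file and cleans up afterwards. The helper name is made up for illustration; `0001<001>` is the file name from the failing line in the sort log, and the plain name is there only as a control. Run it from inside the shared folder, and use scratch names only, since the helper deletes the file it creates.

```shell
#!/bin/sh
# Try to create DIR/NAME, report the result, and remove the test file.
# Usage: can_create NAME DIR
can_create() {
    if touch "$2/$1" 2>/dev/null; then
        rm -f "$2/$1"
        echo "ok: $1"
    else
        echo "FAILED: $1"
        return 1
    fi
}

# Example, run from the shared folder:
#   can_create 'plain_test' .     # control: an ordinary name
#   can_create 'test<>' .         # same check as: touch "test<>"
#   can_create '0001<001>' .      # name pattern seen in the sort log
```

If the plain name works but the names containing `<` and `>` fail, that points at the file naming. If all three succeed, the naming is fine and the problem lies elsewhere in the shared filesystem.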

eburchard commented 4 years ago

OK, thanks very much, I appreciate your help! I'll try your suggestions!