mikolmogorov / Flye

De novo assembler for single molecule sequencing reads using repeat graphs

Flye crashes during repeat or contigger steps #188

Psy-Fer closed 4 years ago

Psy-Fer commented 4 years ago

Hello,

I'd like to start out by saying I really like Flye. It's great, and the latest improvements have been great.

I am, however, having some trouble with some of my trickier genome assemblies.

I ran Flye like so

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -N FLYE
#$ -l mem_requested=40G
#$ -l h_vmem=40G
#$ -pe smp 80 

READS=${1}
THREADS=80

source /home/jamfer/work/venv2714/bin/activate

/usr/bin/time -v /home/jamfer/work/Flye/bin/flye --threads ${THREADS} --asm-coverage 30 -g 3g --nano-raw ${READS} --resume --iterations 2 --out-dir asm 

qstat -j ${SGE_JOBID} | grep usage

With a fastq as input

[2019-11-14 13:20:16] INFO: Starting Flye 2.5-release
[2019-11-14 13:20:16] INFO: Resuming previous run
[2019-11-14 13:20:16] INFO: >>>STAGE: repeat
[2019-11-14 13:20:16] INFO: Building and resolving repeat graph
[2019-11-14 13:20:16] INFO: Reading sequences
[2019-11-14 13:31:52] INFO: Building repeat graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2019-11-15 11:33:53] INFO: Median overlap divergence: 0.069523
[2019-11-16 04:15:26] INFO: Aligning reads to the graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2019-11-17 09:18:57] INFO: Aligned read sequence: 96694913872 / 114451191639 (0.844857)
[2019-11-17 09:18:57] INFO: Median overlap divergence: 0.151014
[2019-11-17 09:20:45] INFO: Mean edge coverage: 61
[2019-11-17 14:03:21] INFO: Resolving repeats
[2019-11-20 16:19:32] ERROR: Caught unhandled exception: std::bad_alloc
[2019-11-20 16:19:32] ERROR:    flye-repeat(_Z16exceptionHandlerv+0xa6) [0x4b8a46]
[2019-11-20 16:19:32] ERROR:    /share/ClusterShare/software/contrib/shacar/gcc/gcc-5.5.0/5.5.0/lib64/libstdc++.so.6(+0x8c9e6) [0x2b0153d4b9e6]
[2019-11-20 16:19:32] ERROR:    /share/ClusterShare/software/contrib/shacar/gcc/gcc-5.5.0/5.5.0/lib64/libstdc++.so.6(+0x8ca31) [0x2b0153d4ba31]
[2019-11-20 16:19:32] ERROR:    /share/ClusterShare/software/contrib/shacar/gcc/gcc-5.5.0/5.5.0/lib64/libstdc++.so.6(+0x8cc49) [0x2b0153d4bc49]
[2019-11-20 16:19:32] ERROR:    /share/ClusterShare/software/contrib/shacar/gcc/gcc-5.5.0/5.5.0/lib64/libstdc++.so.6(+0x8d15c) [0x2b0153d4c15c]
[2019-11-20 16:19:32] ERROR:    flye-repeat(_ZNSt6vectorIP9GraphEdgeSaIS1_EE19_M_emplace_back_auxIJRKS1_EEEvDpOT_+0x4e) [0x43f64e]
[2019-11-20 16:19:32] ERROR:    flye-repeat(_ZN15OutputGenerator9outputDotERKSt6vectorI15UnbranchingPathSaIS1_EERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0xd41) [0x487131]
[2019-11-20 16:19:32] ERROR:    flye-repeat(main+0xcf1) [0x4325a1]
[2019-11-20 16:19:32] ERROR:    /lib64/libc.so.6(__libc_start_main+0xf5) [0x2b0154792495]
[2019-11-20 16:19:32] ERROR:    flye-repeat() [0x432ecf]
[2019-11-20 18:19:09] ERROR: Command '['flye-repeat', '--disjointigs', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/10-consensus/consensus.fasta', '--reads', '/directflow/KCCGGenometechTemp/projects/jamfer/data/sample_pass.fastq', '--out-dir', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/20-repeat', '--config', '/home/jamfer/work/Flye/flye/config/bin_cfg/asm_raw_reads.cfg', '--log', '/directflow/KCCGGenometechTemp/projects/jamferflye/asm/flye.log', '--threads', '80', '--min-ovlp', '1000', '--kmer', '17']' returned non-zero exit status -6
Command exited with non-zero status 1
    Command being timed: "/home/jamfer/work/Flye/bin/flye --threads 80 --asm-coverage 30 -g 3g --nano-raw /directflow/KCCGGenometechTemp/projects/jamfer/data/sample_pass.fastq --resume --iterations 2 --out-dir asm"
    User time (seconds): 14604393.48
    System time (seconds): 74081.98
    Percent of CPU this job got: 2736%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 148:58:54
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 1927083604
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 351
    Minor (reclaiming a frame) page faults: 15402535796
    Voluntary context switches: 80318608
    Involuntary context switches: 158512309
    Swaps: 0
    File system inputs: 116400
    File system outputs: 3854598528
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 1

from the flye.log file
...
[2019-11-17 14:08:43] DEBUG:          475198    num:1   flank:934   span:14380
[2019-11-17 14:08:43] DEBUG:        T -508730   num:1   flank:631   span:410
[2019-11-17 14:08:43] DEBUG:          511786    num:1   flank:916   span:10362
[2019-11-17 14:08:43] DEBUG: Mult: -416037  43519   39   (0,1)
[2019-11-17 14:08:43] DEBUG: Starting -527606 aln:55 minSpan:32
[2019-11-17 14:08:43] DEBUG:          -419194   num:9   flank:9532  span:2062
[2019-11-17 14:08:43] DEBUG:      L   386956    num:11  flank:3556  span:2
[2019-11-17 14:08:43] DEBUG: Mult: 527606   52840   35   (1,0)
[2019-11-17 14:08:43] DEBUG: Starting 526127 aln:14 minSpan:27086
[2019-11-17 14:08:43] DEBUG:          529776    num:1   flank:6350  span:5144
[2019-11-17 14:08:43] DEBUG:          -409792   num:1   flank:1799  span:2521
[2019-11-17 14:08:43] DEBUG: Mult: -526127  121557  14   (1,0)
[2019-11-17 14:10:43] DEBUG: Writing Dot
[2019-11-20 16:19:32] ERROR: Caught unhandled exception: std::bad_alloc
[2019-11-20 16:19:32] ERROR:    flye-repeat(_Z16exceptionHandlerv+0xa6) [0x4b8a46]
[2019-11-20 16:19:32] ERROR:    /share/ClusterShare/software/contrib/shacar/gcc/gcc-5.5.0/5.5.0/lib64/libstdc++.so.6(+0x8c9e6) [0x2b0153d4b9e6]
[2019-11-20 16:19:32] ERROR:    /share/ClusterShare/software/contrib/shacar/gcc/gcc-5.5.0/5.5.0/lib64/libstdc++.so.6(+0x8ca31) [0x2b0153d4ba31]
[2019-11-20 16:19:32] ERROR:    /share/ClusterShare/software/contrib/shacar/gcc/gcc-5.5.0/5.5.0/lib64/libstdc++.so.6(+0x8cc49) [0x2b0153d4bc49]
[2019-11-20 16:19:32] ERROR:    /share/ClusterShare/software/contrib/shacar/gcc/gcc-5.5.0/5.5.0/lib64/libstdc++.so.6(+0x8d15c) [0x2b0153d4c15c]
[2019-11-20 16:19:32] ERROR:    flye-repeat(_ZNSt6vectorIP9GraphEdgeSaIS1_EE19_M_emplace_back_auxIJRKS1_EEEvDpOT_+0x4e) [0x43f64e]
[2019-11-20 16:19:32] ERROR:    flye-repeat(_ZN15OutputGenerator9outputDotERKSt6vectorI15UnbranchingPathSaIS1_EERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0xd41) [0x487131]
[2019-11-20 16:19:32] ERROR:    flye-repeat(main+0xcf1) [0x4325a1]
[2019-11-20 16:19:32] ERROR:    /lib64/libc.so.6(__libc_start_main+0xf5) [0x2b0154792495]
[2019-11-20 16:19:32] ERROR:    flye-repeat() [0x432ecf]
[2019-11-20 18:19:09] root: ERROR: Command '['flye-repeat', '--disjointigs', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/10-consensus/consensus.fasta', '--reads', '/directflow/KCCGGenometechTemp/projects/jamfer/data/sample_pass.fastq', '--out-dir', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/20-repeat', '--config', '/home/jamfer/work/Flye/flye/config/bin_cfg/asm_raw_reads.cfg', '--log', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/flye.log', '--threads', '80', '--min-ovlp', '1000', '--kmer', '17']' returned non-zero exit status -6
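A note on the `exit status -6` in the last log line: Python's `subprocess` reports a negative return code when the child process was terminated by a signal, and signal 6 is SIGABRT on Linux. That is the usual end of an uncaught C++ exception such as `std::bad_alloc`. A minimal sketch of decoding such a status follows; the helper name `describe_returncode` is made up for illustration and is not part of Flye.

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Describe a return code the way subprocess.Popen.returncode encodes it."""
    if returncode < 0:
        # On POSIX, a negative value -N means the child was killed by signal N.
        return f"killed by {signal.Signals(-returncode).name}"
    if returncode == 0:
        return "exited normally"
    return f"exited with status {returncode}"


# The status reported for flye-repeat in the log above.
print(describe_returncode(-6))
```

So the `-6` is not a Flye-specific error code; it only says the process aborted, and the `std::bad_alloc` line is the actual cause.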

This has happened quite a lot, and I'd also like to note the following timestamps:

[2019-11-17 14:08:43] DEBUG: Mult: -526127 121557 14 (1,0)
[2019-11-17 14:10:43] DEBUG: Writing Dot
[2019-11-20 16:19:32] ERROR: Caught unhandled exception: std::bad_alloc
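To put a number on that gap, the timestamps can be subtracted directly. This is only a small sketch that parses the bracketed stamp at the start of each log line; the helper name `log_time` is an assumption for illustration.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def log_time(line: str) -> datetime:
    """Parse the leading "[YYYY-MM-DD HH:MM:SS]" stamp of a log line."""
    stamp = line[line.index("[") + 1 : line.index("]")]
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


writing_dot = log_time("[2019-11-17 14:10:43] DEBUG: Writing Dot")
crash = log_time(
    "[2019-11-20 16:19:32] ERROR: Caught unhandled exception: std::bad_alloc"
)

# Time spent between the last DEBUG message and the crash.
gap = crash - writing_dot
print(gap)
```

The result is just over three days spent after "Writing Dot" with no further log output, whereas the step before it took two minutes.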

There is something weird going on, because I'm giving it 80x40G of RAM (3.2TB), and it's taking a long time and still blowing up.
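For reference, here is the arithmetic behind those figures next to the peak memory that `/usr/bin/time -v` reported above. It is a rough sketch under two assumptions: that `40G` in the job script means 40 GiB per slot, and that the "Maximum resident set size" value can be taken at face value. Note also that `h_vmem` limits virtual memory while this field measures resident memory, so the two are not strictly comparable.

```python
KIB_PER_GIB = 1024 ** 2

slots = 80                   # from "#$ -pe smp 80"
per_slot_gib = 40            # from "#$ -l h_vmem=40G", read as GiB per slot (assumption)
max_rss_kib = 1_927_083_604  # "Maximum resident set size (kbytes)" from /usr/bin/time -v

requested_gib = slots * per_slot_gib
peak_gib = max_rss_kib / KIB_PER_GIB

print(f"requested: {requested_gib} GiB")
print(f"peak RSS:  {peak_gib:.0f} GiB ({peak_gib / requested_gib:.0%} of requested)")
```

On these numbers the process peaked at a bit over half of what was requested, which suggests the failing allocation was a single very large request rather than memory running out gradually. That reading is a guess from the figures, not something the logs confirm.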

Is there something in the memory allocation that needs to load everything into memory to resolve?

Any help on this would be appreciated

Kind Regards, James

mikolmogorov commented 4 years ago

Hi,

This could be related to #160. Could you try getting the latest version from the flye branch and resuming from the repeat stage?

Mikhail

Psy-Fer commented 4 years ago

Hello,

Okay, latest version installed, and running using --resume

I'll let you know how it goes after the weekend.

Cheers

Psy-Fer commented 4 years ago

Hello, so that resume just failed again after a week of sitting on the same Writing Dot bit.

[2019-11-22 20:49:59] INFO: Starting Flye 2.6-release
[2019-11-22 20:49:59] INFO: Resuming previous run
[2019-11-22 20:49:59] INFO: >>>STAGE: repeat
[2019-11-22 20:49:59] INFO: Building and resolving repeat graph
[2019-11-22 20:49:59] INFO: Reading sequences
[2019-11-22 21:05:46] INFO: Building repeat graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2019-11-23 19:19:33] INFO: Median overlap divergence: 0.0695257
[2019-11-24 17:01:21] INFO: Aligning reads to the graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2019-11-25 20:08:13] INFO: Aligned read sequence: 96690183481 / 114451191639 (0.844816)
[2019-11-25 20:08:13] INFO: Median overlap divergence: 0.150937
[2019-11-25 20:09:58] INFO: Mean edge coverage: 61
[2019-11-25 23:48:28] INFO: Resolving repeats
[2019-11-29 00:08:57] ERROR: Caught unhandled exception: std::bad_alloc
[2019-11-29 00:08:57] ERROR:    flye-repeat(_Z16exceptionHandlerv+0x30) [0x4b6ba0]
[2019-11-29 00:08:57] ERROR:    /share/ClusterShare/software/contrib/evaben7/gcc/8.2.0/gcc-8.2.0/lib64/libstdc++.so.6(+0x91db6) [0x2b3941b87db6]
[2019-11-29 00:08:57] ERROR:    /share/ClusterShare/software/contrib/evaben7/gcc/8.2.0/gcc-8.2.0/lib64/libstdc++.so.6(+0x91df1) [0x2b3941b87df1]
[2019-11-29 00:08:57] ERROR:    /share/ClusterShare/software/contrib/evaben7/gcc/8.2.0/gcc-8.2.0/lib64/libstdc++.so.6(+0x92024) [0x2b3941b88024]
[2019-11-29 00:08:57] ERROR:    /share/ClusterShare/software/contrib/evaben7/gcc/8.2.0/gcc-8.2.0/lib64/libstdc++.so.6(+0x9250c) [0x2b3941b8850c]
[2019-11-29 00:08:57] ERROR:    flye-repeat(_ZNSt6vectorIP9GraphEdgeSaIS1_EE17_M_realloc_insertIJRKS1_EEEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEDpOT_+0x5a) [0x43b1ea]
[2019-11-29 00:08:57] ERROR:    flye-repeat(_ZN15OutputGenerator9outputDotERKSt6vectorI15UnbranchingPathSaIS1_EERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x1f92) [0x481d62]
[2019-11-29 00:08:57] ERROR:    flye-repeat(main+0xf57) [0x42ec27]
[2019-11-29 00:08:57] ERROR:    /lib64/libc.so.6(__libc_start_main+0xf5) [0x2b39425d2495]
[2019-11-29 00:08:57] ERROR:    flye-repeat() [0x42f29f]
[2019-11-29 01:58:54] ERROR: Command '['flye-repeat', '--disjointigs', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/10-consensus/consensus.fasta', '--reads', '/directflow/KCCGGenometechTemp/projects/jamfer/data/sample_pass.fastq', '--out-dir', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/20-repeat', '--config', '/home/jamfer/work/Flye_2.6/flye/config/bin_cfg/asm_raw_reads.cfg', '--log', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/flye.log', '--threads', '80', '--min-ovlp', '1000', '--kmer', '17']' died with <Signals.SIGABRT: 6>.
Command exited with non-zero status 1
    Command being timed: "/home/jamfer/work/Flye_2.6/bin/flye --threads 80 --asm-coverage 30 -g 3g --nano-raw /directflow/KCCGGenometechTemp/projects/jamfer/data/sample_pass.fastq --resume --iterations 2 --out-dir asm"
    User time (seconds): 14087717.84
    System time (seconds): 80284.00
    Percent of CPU this job got: 2638%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 149:08:56
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 1900957936
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 372
    Minor (reclaiming a frame) page faults: 17630527614
    Voluntary context switches: 81696314
    Involuntary context switches: 153059551
    Swaps: 0
    File system inputs: 113232
    File system outputs: 3802347144
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 1

flye.log

[2019-11-25 23:55:40] DEBUG: Mult: -503565  43519   40   (0,1)
[2019-11-25 23:55:40] DEBUG: Starting 525867 aln:55 minSpan:54
[2019-11-25 23:55:40] DEBUG:          -514913   num:9   flank:9530  span:2094
[2019-11-25 23:55:40] DEBUG:      L   485229    num:11  flank:3556  span:4
[2019-11-25 23:55:40] DEBUG: Mult: -525867  52840   35   (1,0)
[2019-11-25 23:55:40] DEBUG: Starting 530080 aln:14 minSpan:27086
[2019-11-25 23:55:40] DEBUG:          527760    num:1   flank:6350  span:5144
[2019-11-25 23:55:40] DEBUG:          85244 num:1   flank:1799  span:2521
[2019-11-25 23:55:40] DEBUG: Mult: 530080   121557  14   (0,1)
[2019-11-25 23:58:30] DEBUG: Writing Dot
-bash-4.2$ tail asm/flye.log 
[2019-11-29 00:08:57] ERROR:    /share/ClusterShare/software/contrib/evaben7/gcc/8.2.0/gcc-8.2.0/lib64/libstdc++.so.6(+0x91db6) [0x2b3941b87db6]
[2019-11-29 00:08:57] ERROR:    /share/ClusterShare/software/contrib/evaben7/gcc/8.2.0/gcc-8.2.0/lib64/libstdc++.so.6(+0x91df1) [0x2b3941b87df1]
[2019-11-29 00:08:57] ERROR:    /share/ClusterShare/software/contrib/evaben7/gcc/8.2.0/gcc-8.2.0/lib64/libstdc++.so.6(+0x92024) [0x2b3941b88024]
[2019-11-29 00:08:57] ERROR:    /share/ClusterShare/software/contrib/evaben7/gcc/8.2.0/gcc-8.2.0/lib64/libstdc++.so.6(+0x9250c) [0x2b3941b8850c]
[2019-11-29 00:08:57] ERROR:    flye-repeat(_ZNSt6vectorIP9GraphEdgeSaIS1_EE17_M_realloc_insertIJRKS1_EEEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEDpOT_+0x5a) [0x43b1ea]
[2019-11-29 00:08:57] ERROR:    flye-repeat(_ZN15OutputGenerator9outputDotERKSt6vectorI15UnbranchingPathSaIS1_EERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x1f92) [0x481d62]
[2019-11-29 00:08:57] ERROR:    flye-repeat(main+0xf57) [0x42ec27]
[2019-11-29 00:08:57] ERROR:    /lib64/libc.so.6(__libc_start_main+0xf5) [0x2b39425d2495]
[2019-11-29 00:08:57] ERROR:    flye-repeat() [0x42f29f]
[2019-11-29 01:58:54] root: ERROR: Command '['flye-repeat', '--disjointigs', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/10-consensus/consensus.fasta', '--reads', '/directflow/KCCGGenometechTemp/projects/jamfer/data/sample_pass.fastq', '--out-dir', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/20-repeat', '--config', '/home/jamfer/work/Flye_2.6/flye/config/bin_cfg/asm_raw_reads.cfg', '--log', '/directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/flye.log', '--threads', '80', '--min-ovlp', '1000', '--kmer', '17']' died with <Signals.SIGABRT: 6>.

What now?

mikolmogorov commented 4 years ago

Based on the log file (line "Flye 2.6-release"), you were running the release version rather than the latest GitHub code from the flye branch.

Please make sure you get the latest GitHub source (via git clone), build it, and run the built version (and not the one that is currently in your system). The first log line should be Starting Flye 2.6-g6f887ae.

Psy-Fer commented 4 years ago

Hey,

So I have done

git clone https://github.com/fenderglass/Flye.git
cd Flye
source ~/work/venv363/bin/activate
python setup.py install

However, when I look at the version, it all says 2.6-release. I made the flye file in the binary manually point to the build/lib/flye/main.py, but it still says release.

Am I doing something wrong? Apologies if I'm doing something silly. (I also wiped all flye, then started from scratch, and got the same thing.)

Thanks for the help. Regards James

mikolmogorov commented 4 years ago

OK, my mistake: I forgot that when you install Flye (rather than running it from the local dir), the git version does not show up.

So it seems that the version in the main branch did not fix the issue. Could you please try another version from the flye-devel branch? This one contains many updates which might have fixed the issue, and it is also better at detecting graph inconsistencies early.

P.S. You don't need to install Flye into the system (so you can have multiple versions). You can build Flye in the source dir using python setup.py build and then run bin/flye.

mikolmogorov commented 4 years ago

Also, feel free to do --resume-from repeat with the flye-devel version.

Psy-Fer commented 4 years ago

Okay thanks.

So I did git clone --single-branch --branch flye-devel https://github.com/fenderglass/Flye.git

When I try python setup.py build, I get the following error (tried on python2/3 and different gcc versions).

Last pertinent bits

/home/jamfer/work/Flye_devel/src/bin/repeat.cpp:252: undefined reference to `OutputGenerator::outputDot(std::vector<UnbranchingPath, std::allocator<UnbranchingPath> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/home/jamfer/work/Flye_devel/src/bin/repeat.cpp:254: undefined reference to `OutputGenerator::outputFasta(std::vector<UnbranchingPath, std::allocator<UnbranchingPath> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
collect2: error: ld returned 1 exit status
make[1]: *** [flye-repeat] Error 1
make[1]: Leaving directory `/home/jamfer/work/Flye_devel/src'
make: *** [all] Error 2
Traceback (most recent call last):
  File "setup.py", line 38, in run
    subprocess.check_call(["make"])
  File "/home/jamfer/work/python363/lib/python3.6/subprocess.py", line 291, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['make']' returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 80, in <module>
    'install' : MakeInstall}
  File "/home/jamfer/work/venv363/lib/python3.6/site-packages/setuptools/__init__.py", line 145, in setup
    return distutils.core.setup(**attrs)
  File "/home/jamfer/work/python363/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/jamfer/work/python363/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/home/jamfer/work/python363/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "setup.py", line 40, in run
    sys.exit("Compilation error: ", e)
TypeError: exit expected at most 1 arguments, got 2

Any ideas? Thanks again for your help
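The second traceback above is a separate, smaller problem from the linker error. `sys.exit` accepts at most one argument, so the call `sys.exit("Compilation error: ", e)` raises `TypeError` and hides the message it was meant to print. Below is a minimal sketch of that failure and of one way to pass the same information as a single argument. The function names are illustrative and this is not Flye's actual `setup.py` code.

```python
import subprocess
import sys


def exit_with_two_args(error: Exception) -> None:
    # Mirrors the call shown in the traceback; sys.exit takes at most one argument.
    sys.exit("Compilation error: ", error)


def exit_with_message(error: Exception) -> None:
    # Format everything into one string so SystemExit carries the full message.
    sys.exit(f"Compilation error: {error}")


# Stand-in for the exception raised by subprocess.check_call(["make"]).
failure = subprocess.CalledProcessError(2, ["make"])

try:
    exit_with_two_args(failure)
except TypeError as exc:
    two_arg_error = type(exc).__name__
    print("two-argument form raised:", two_arg_error)

try:
    exit_with_message(failure)
except SystemExit as exc:
    one_arg_code = exc.code
    print("one-argument form exits with:", one_arg_code)
```

Either way, the build failure to look at is the `undefined reference` output from the linker; the `TypeError` is only noise from the error reporting.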

mikolmogorov commented 4 years ago

Thanks, looks like a compilation error. In the source dir, can you run make clean, then make, and post the entire terminal output? Also, post the output of g++ --version.

Psy-Fer commented 4 years ago

Hmm, that's weird. I tried that a few times before. Went and did some other things, and tried again, and it all compiled this time. So then I ran python setup.py build

Full build output:

 make
make -C /home/jamfer/work/Flye/lib/minimap2
make[1]: Entering directory `/home/jamfer/work/Flye/lib/minimap2'
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  main.c -o main.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  kthread.c -o kthread.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  kalloc.c -o kalloc.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  misc.c -o misc.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  bseq.c -o bseq.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  sketch.c -o sketch.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  sdust.c -o sdust.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  options.c -o options.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  index.c -o index.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  chain.c -o chain.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  align.c -o align.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  hit.c -o hit.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  map.c -o map.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  format.c -o format.o
format.c: In function ‘mm_write_sam3’:
format.c:391:36: warning: variable ‘this_rev’ set but not used [-Wunused-but-set-variable]
  int this_rid = -1, this_pos = -1, this_rev = 0;
                                    ^~~~~~~~
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  pe.c -o pe.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  esterr.c -o esterr.o
cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  splitidx.c -o splitidx.o
cc -c -g -Wall -O2 -Wc++-compat  -msse2 -DHAVE_KALLOC  ksw2_ll_sse.c -o ksw2_ll_sse.o
cc -c -g -Wall -O2 -Wc++-compat  -msse4.1 -DHAVE_KALLOC -DKSW_CPU_DISPATCH  ksw2_extz2_sse.c -o ksw2_extz2_sse41.o
cc -c -g -Wall -O2 -Wc++-compat  -msse4.1 -DHAVE_KALLOC -DKSW_CPU_DISPATCH  ksw2_extd2_sse.c -o ksw2_extd2_sse41.o
cc -c -g -Wall -O2 -Wc++-compat  -msse4.1 -DHAVE_KALLOC -DKSW_CPU_DISPATCH  ksw2_exts2_sse.c -o ksw2_exts2_sse41.o
cc -c -g -Wall -O2 -Wc++-compat  -msse2 -mno-sse4.1 -DHAVE_KALLOC -DKSW_CPU_DISPATCH -DKSW_SSE2_ONLY  ksw2_extz2_sse.c -o ksw2_extz2_sse2.o
cc -c -g -Wall -O2 -Wc++-compat  -msse2 -mno-sse4.1 -DHAVE_KALLOC -DKSW_CPU_DISPATCH -DKSW_SSE2_ONLY  ksw2_extd2_sse.c -o ksw2_extd2_sse2.o
cc -c -g -Wall -O2 -Wc++-compat  -msse2 -mno-sse4.1 -DHAVE_KALLOC -DKSW_CPU_DISPATCH -DKSW_SSE2_ONLY  ksw2_exts2_sse.c -o ksw2_exts2_sse2.o
cc -c -g -Wall -O2 -Wc++-compat  -msse4.1 -DHAVE_KALLOC -DKSW_CPU_DISPATCH  ksw2_dispatch.c -o ksw2_dispatch.o
ar -csru libminimap2.a kthread.o kalloc.o misc.o bseq.o sketch.o sdust.o options.o index.o chain.o align.o hit.o map.o format.o pe.o esterr.o splitidx.o ksw2_ll_sse.o ksw2_extz2_sse41.o ksw2_extd2_sse41.o ksw2_exts2_sse41.o ksw2_extz2_sse2.o ksw2_extd2_sse2.o ksw2_exts2_sse2.o ksw2_dispatch.o
cc -g -Wall -O2 -Wc++-compat  main.o -o minimap2 -L. -lminimap2 -lm -lz -lpthread
make[1]: Leaving directory `/home/jamfer/work/Flye/lib/minimap2'
cp /home/jamfer/work/Flye/lib/minimap2/minimap2 /home/jamfer/work/Flye/bin/flye-minimap2
make release -C src
make[1]: Entering directory `/home/jamfer/work/Flye/src'
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG repeat_graph/multiplicity_inferer.cpp -o repeat_graph/multiplicity_inferer.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG repeat_graph/repeat_resolver.cpp -o repeat_graph/repeat_resolver.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG repeat_graph/read_aligner.cpp -o repeat_graph/read_aligner.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG repeat_graph/graph_processing.cpp -o repeat_graph/graph_processing.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG repeat_graph/repeat_graph.cpp -o repeat_graph/repeat_graph.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG repeat_graph/output_generator.cpp -o repeat_graph/output_generator.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG repeat_graph/haplotype_resolver.cpp -o repeat_graph/haplotype_resolver.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG sequence/sequence.cpp -o sequence/sequence.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG sequence/consensus_generator.cpp -o sequence/consensus_generator.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG sequence/vertex_index.cpp -o sequence/vertex_index.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG sequence/sequence_container.cpp -o sequence/sequence_container.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG sequence/overlap.cpp -o sequence/overlap.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG bin/repeat.cpp -o bin/repeat.o
g++ repeat_graph/multiplicity_inferer.o repeat_graph/repeat_resolver.o repeat_graph/read_aligner.o repeat_graph/graph_processing.o repeat_graph/repeat_graph.o repeat_graph/output_generator.o repeat_graph/haplotype_resolver.o sequence/sequence.o sequence/consensus_generator.o sequence/vertex_index.o sequence/sequence_container.o sequence/overlap.o bin/repeat.o -o /home/jamfer/work/Flye/bin/flye-repeat -lz -L/home/jamfer/work/Flye/lib/minimap2 -lminimap2 -pthread -std=c++11 -rdynamic
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG polishing/dinucleotide_fixer.cpp -o polishing/dinucleotide_fixer.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG polishing/homo_polisher.cpp -o polishing/homo_polisher.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG polishing/bubble_processor.cpp -o polishing/bubble_processor.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG polishing/alignment.cpp -o polishing/alignment.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG polishing/subs_matrix.cpp -o polishing/subs_matrix.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG polishing/general_polisher.cpp -o polishing/general_polisher.o
g++ -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG   -c -o bin/polisher.o bin/polisher.cpp
g++ polishing/dinucleotide_fixer.o polishing/homo_polisher.o polishing/bubble_processor.o polishing/alignment.o polishing/subs_matrix.o polishing/general_polisher.o bin/polisher.o -o /home/jamfer/work/Flye/bin/flye-polish -lz -L/home/jamfer/work/Flye/lib/minimap2 -lminimap2 -pthread -std=c++11 -rdynamic
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG assemble/parameters_estimator.cpp -o assemble/parameters_estimator.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG assemble/chimera.cpp -o assemble/chimera.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG assemble/extender.cpp -o assemble/extender.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG bin/assemble.cpp -o bin/assemble.o
g++ assemble/parameters_estimator.o assemble/chimera.o assemble/extender.o sequence/sequence.o sequence/consensus_generator.o sequence/vertex_index.o sequence/sequence_container.o sequence/overlap.o bin/assemble.o -o /home/jamfer/work/Flye/bin/flye-assemble -lz -L/home/jamfer/work/Flye/lib/minimap2 -lminimap2 -pthread -std=c++11 -rdynamic
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG contigger/contig_extender.cpp -o contigger/contig_extender.o
g++ -c -I/home/jamfer/work/Flye/lib/libcuckoo -I/home/jamfer/work/Flye/lib/interval_tree -I/home/jamfer/work/Flye/lib/lemon -I/home/jamfer/work/Flye/lib/minimap2 -Wall -Wextra -pthread -std=c++11 -g -O3 -DNDEBUG bin/contigger.cpp -o bin/contigger.o
g++ contigger/contig_extender.o repeat_graph/multiplicity_inferer.o repeat_graph/repeat_resolver.o repeat_graph/read_aligner.o repeat_graph/graph_processing.o repeat_graph/repeat_graph.o repeat_graph/output_generator.o repeat_graph/haplotype_resolver.o sequence/sequence.o sequence/consensus_generator.o sequence/vertex_index.o sequence/sequence_container.o sequence/overlap.o bin/contigger.o -o /home/jamfer/work/Flye/bin/flye-contigger -lz -L/home/jamfer/work/Flye/lib/minimap2 -lminimap2 -pthread -std=c++11 -rdynamic
make[1]: Leaving directory `/home/jamfer/work/Flye/src'
g++ --version
g++ (GCC) 8.2.0

I am now running bin/flye with --resume-from repeat

[2019-12-03 20:28:37] root: INFO: Starting Flye 2.6-release
[2019-12-03 20:28:37] root: DEBUG: Cmd: /home/jamfer/work/Flye/bin/flye --threads 80 --asm-coverage 30 -g 3g --nano-raw /directflow/KCCGGenometechTemp/projects/jamfer/data/sample_pass.fastq --resume-from repeat --iterations 2 --out-dir asm
[2019-12-03 20:28:37] root: DEBUG: Python version: 3.6.3 (default, Nov 16 2017, 17:36:10) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)]
[2019-12-03 20:28:37] root: INFO: Resuming previous run
[2019-12-03 20:28:37] root: INFO: >>>STAGE: repeat
[2019-12-03 20:28:37] root: INFO: Building and resolving repeat graph
[2019-12-03 20:28:37] root: DEBUG: -----Begin repeat analyser log------
[2019-12-03 20:28:37] root: DEBUG: Running: flye-repeat --disjointigs /directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/10-consensus/consensus.fasta --reads /directflow/KCCGGenometechTemp/projects/jamfer/data/sample_pass.fastq --out-dir /directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/20-repeat --config /home/jamfer/work/Flye/flye/config/bin_cfg/asm_raw_reads.cfg --log /directflow/KCCGGenometechTemp/projects/jamfer/flye/asm/flye.log --threads 80 --min-ovlp 1000 --kmer 17
[2019-12-03 20:28:37] DEBUG: Build date: Dec  3 2019 20:23:12
[2019-12-03 20:28:37] DEBUG: Total RAM: 3021 Gb
[2019-12-03 20:28:37] DEBUG: Available RAM: 2998 Gb
[2019-12-03 20:28:37] DEBUG: Total CPUs: 144
[2019-12-03 20:28:37] DEBUG: Parameters:
[2019-12-03 20:28:37] DEBUG:    big_genome_threshold=29000000
[2019-12-03 20:28:37] DEBUG:    low_cutoff_warning=1
[2019-12-03 20:28:37] DEBUG:    hard_min_coverage_rate=10
[2019-12-03 20:28:37] DEBUG:    assemble_kmer_sample=1
[2019-12-03 20:28:37] DEBUG:    repeat_graph_kmer_sample=1
[2019-12-03 20:28:37] DEBUG:    read_align_kmer_sample=1
[2019-12-03 20:28:37] DEBUG:    meta_read_top_kmer_rate=0.25
[2019-12-03 20:28:37] DEBUG:    meta_read_filter_kmer_freq=10
[2019-12-03 20:28:37] DEBUG:    maximum_jump=1500
[2019-12-03 20:28:37] DEBUG:    maximum_overhang=1500
[2019-12-03 20:28:37] DEBUG:    repeat_kmer_rate=100
[2019-12-03 20:28:37] DEBUG:    assemble_ovlp_relative_divergence=0.10
[2019-12-03 20:28:37] DEBUG:    repeat_graph_ovlp_divergence=0.15
[2019-12-03 20:28:37] DEBUG:    read_align_ovlp_divergence=0.25
[2019-12-03 20:28:37] DEBUG:    max_coverage_drop_rate=5
[2019-12-03 20:28:37] DEBUG:    chimera_window=100
[2019-12-03 20:28:37] DEBUG:    min_reads_in_disjointig=4
[2019-12-03 20:28:37] DEBUG:    max_inner_reads=10
[2019-12-03 20:28:37] DEBUG:    max_inner_fraction=0.25
[2019-12-03 20:28:37] DEBUG:    add_unassembled_reads=0
[2019-12-03 20:28:37] DEBUG:    max_separation=500
[2019-12-03 20:28:37] DEBUG:    unique_edge_length=50000
[2019-12-03 20:28:37] DEBUG:    min_repeat_res_support=0.51
[2019-12-03 20:28:37] DEBUG:    out_paths_ratio=5
[2019-12-03 20:28:37] DEBUG:    graph_cov_drop_rate=5
[2019-12-03 20:28:37] DEBUG:    coverage_estimate_window=100
[2019-12-03 20:28:37] DEBUG:    extend_contigs_with_repeats=1
[2019-12-03 20:28:37] DEBUG:    min_read_cov_cutoff=3
[2019-12-03 20:28:37] DEBUG:    short_tip_length=20000
[2019-12-03 20:28:37] DEBUG:    long_tip_length=100000
[2019-12-03 20:28:37] DEBUG:    max_bubble_length=50000
[2019-12-03 20:28:37] DEBUG: Running with k-mer size: 17
[2019-12-03 20:28:37] DEBUG: Selected minimum overlap 1000
[2019-12-03 20:28:37] DEBUG: Metagenome mode: N
[2019-12-03 20:28:37] INFO: Reading sequences

I can't really tell whether it's running the new build or not, as the version string hasn't been updated... fingers crossed?

Psy-Fer commented 4 years ago

Hmm, I just noticed that I am not loading GCC 8.2.0 in my job, and it's using [GCC 4.4.7 20120313 (Red Hat 4.4.7-17)]

Would that be an issue? (going to restart and load that up just to remove it as a problem)

mikolmogorov commented 4 years ago

As long as it has started - it shouldn't be an issue, I think. Let me know how it goes.

P.S. Are you sure you're building the devel version though? The folder name was Flye_devel in your previous post, and now it's Flye. Just making sure :)

Psy-Fer commented 4 years ago

Hey,

Yeah, I just deleted and re-downloaded the branch to keep things simple. I'll let you know. It might take a while; this genome is pretty repetitive.

dcopetti commented 4 years ago

Hello, related to issue #192, I tried downloading the flye-devel branch with git clone --single-branch --branch flye-devel https://github.com/fenderglass/Flye.git and I have two types of issues. The make command gives an error with the zlib.h file:

(base) bash-4.2$ make
make -C /home/copettid/bin/Flye_second/lib/minimap2
make[1]: Entering directory '/home/copettid/bin/Flye_second/lib/minimap2'
/home/copettid/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  bseq.c -o bseq.o
bseq.c:1:10: fatal error: zlib.h: No such file or directory
 #include <zlib.h>
          ^~~~~~~~
compilation terminated.
make[1]: *** [Makefile:29: bseq.o] Error 1
make[1]: Leaving directory '/home/copettid/bin/Flye_second/lib/minimap2'
make: *** [Makefile:18: /home/copettid/bin/Flye_second/bin/flye-minimap2] Error 2

though the library was downloaded and is in /usr/include/zlib.h, and that folder is included in our path:

(base) bash-4.2$ echo $PATH
/usr/include:/home/copettid/anaconda3/bin:/home/copettid/anaconda3/condabin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/puppetlabs/bin:/usr/include

But the python script works:

(base) bash-4.2$ bin/flye
usage: flye (--pacbio-raw | --pacbio-corr | --nano-raw |
             --nano-corr | --subassemblies) file1 [file_2 ...]
             --genome-size SIZE --out-dir PATH

             [--threads int] [--iterations int] [--min-overlap int]
             [--meta] [--plasmids] [--no-trestle] [--polish-target]
             [--keep-haplotypes] [--debug] [--version] [--help]
             [--resume] [--resume-from] [--stop-after]
flye: error: the following arguments are required: -o/--out-dir

and can even start an assembly (here with a small subset of reads): python /home/copettid/bin/Flye_second/bin/flye --nano-raw test.fq --genome-size 5g --out-dir test -t 40 --min-overlap 10000 but it dies with:

[2019-12-05 12:14:35] INFO: Resolving repeats
[2019-12-05 12:14:36] INFO: >>>STAGE: trestle
Traceback (most recent call last):
  File "/home/copettid/bin/Flye_first/bin/flye", line 25, in <module>
    sys.exit(main())
  File "/home/copettid/bin/Flye_first/flye/main.py", line 767, in main
    _run(args)
  File "/home/copettid/bin/Flye_first/flye/main.py", line 566, in _run
    jobs[i].run()
  File "/home/copettid/bin/Flye_first/flye/main.py", line 403, in run
    repeat_graph.load_from_file(self.repeat_graph)
  File "/home/copettid/bin/Flye_first/flye/repeat_graph/repeat_graph.py", line 138, in load_from_file
    self_complement, resolved, mean_coverage, alt_group) = tokens[1:]
ValueError: not enough values to unpack (expected 8, got 7)

(at the same step as the previous run died, I think). The tail of the log is:

[2019-12-05 12:37:41] DEBUG: Short-loop: 37
[2019-12-05 12:37:41] DEBUG: Repeat detection iteration 1
[2019-12-05 12:37:41] DEBUG: Starting 88 aln:63 minSpan:15867
[2019-12-05 12:37:41] DEBUG:        T 13        num:2   flank:3737      span:16546
[2019-12-05 12:37:41] DEBUG:        T -91       num:3   flank:2279      span:14112
[2019-12-05 12:37:41] DEBUG: Mult: -88  31811   31       (1,0)
[2019-12-05 12:37:41] DEBUG: Fixed: -80 -> 98 -> 81 -> 11       83198   44
[2019-12-05 12:37:41] DEBUG: Writing Dot
[2019-12-05 12:37:41] DEBUG: Writing FASTA
[2019-12-05 12:37:41] DEBUG: Peak RAM usage: 12 Gb
-----------End assembly log------------
[2019-12-05 12:37:41] root: INFO: >>>STAGE: trestle

and, surprisingly, the log file shows the -release version:

[2019-12-05 12:27:53] root: INFO: Starting Flye 2.6-release
[2019-12-05 12:27:53] root: DEBUG: Cmd: /home/copettid/bin/Flye_second/bin/flye --nano-raw test.fq --genome-size 5g --out-dir test2 -t 40 --min-overlap 10000
[2019-12-05 12:27:53] root: DEBUG: Python version: 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0]
[2019-12-05 12:27:53] root: INFO: >>>STAGE: configure
[2019-12-05 12:27:53] root: INFO: Configuring run

Now, I wonder if the failed assembly is due to a compilation error, or to a bug present in the release rather than in the branch version. Also, how come I did not download the -devel version with git? Regarding the zlib.h issue, how does Flye find system files and headers? @Psy-Fer, how were you able to install the version from the branch? Thanks, Dario

mikolmogorov commented 4 years ago

@dcopetti

The binaries from flye-devel did not compile, and the Python pipeline has probably picked up the old binaries that were previously installed on your system. You are getting the error because the output formats of some binaries have changed a bit. You can double-check this at the beginning of the assembly log: it should report the build date for each binary.
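
For illustration, the ValueError in the traceback is what Python raises when a record has fewer fields than the unpacking expects. A minimal sketch, assuming whitespace-separated records: only the last four field names appear in the traceback, so field_a to field_d and the sample values are hypothetical placeholders, not the real Flye dump format.

```python
def parse_edge(line):
    """Unpack one record the way the traceback suggests: a tag plus eight fields."""
    tokens = line.split()
    # Only the last four names are visible in the traceback; field_a..field_d
    # are hypothetical stand-ins for the ones that are cut off.
    (field_a, field_b, field_c, field_d,
     self_complement, resolved, mean_coverage, alt_group) = tokens[1:]
    return alt_group


new_style = "Edge 1 2 3 4 0 0 17 -1"  # tag + 8 fields (made-up values)
old_style = "Edge 1 2 3 4 0 0 17"     # tag + 7 fields, as an older binary might write

print(parse_edge(new_style))
try:
    parse_edge(old_style)
except ValueError as err:
    # Same message as in the traceback: expected 8, got 7
    print(err)
```

If the dump was written by an older binary and read by newer Python code, a mismatch of this kind is what one would expect to see.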

To compile Flye, you need the development headers for zlib. The installation process depends on your system (on Linux, you typically need to install an extra package). Simply copying *.h files will likely not work.
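
Note that PATH only affects where the shell looks for executables; a C compiler searches its own include directories for headers. Below is a small sketch for checking where zlib.h is visible. The candidate directory list is an assumption (the conda compiler shown in the make output may use its own sysroot rather than /usr/include), so adapt it to your system.

```python
import os


def find_header(name, include_dirs):
    """Return the first directory in include_dirs that contains `name`, else None."""
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, name)):
            return directory
    return None


# Common default locations (an assumption, not what any given compiler uses).
candidates = ["/usr/include", "/usr/local/include"]
conda_prefix = os.environ.get("CONDA_PREFIX")
if conda_prefix:
    # An active conda environment usually keeps its headers here.
    candidates.append(os.path.join(conda_prefix, "include"))

print("zlib.h found in:", find_header("zlib.h", candidates))
```

If the header only shows up in a directory that the active compiler does not search, the build will still fail with the same "No such file or directory" error.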

Don't worry about 2.6-release tag - it seems like it is misbehaving on some systems. I will try to fix it in the future.

mikolmogorov commented 4 years ago

Btw the latest updates are now also in the main github branch (flye) - feel free to use this one as well.

dcopetti commented 4 years ago

Hi Mikhail, I downloaded the main Flye branch with git, but I still get the same error as before. In particular, the compilation date is still the one from September: [2019-12-06 08:44:27] DEBUG: Build date: Sep 19 2019 20:22:15. It is the same for the -devel version (git clone --single-branch --branch flye-devel https://github.com/fenderglass/Flye.git). Maybe your updates need to be pushed?

$ git pull
Already up-to-date.

Thanks

mikolmogorov commented 4 years ago

Please run the commands below in this exact order and post the full output of each command

cd ~
rm -rf Flye
which flye
which flye-assemble
git clone https://github.com/fenderglass/Flye.git
cd Flye
make
bin/flye --version
bin/flye --pacbio-raw flye/tests/data/ecoli_500kb_reads.fastq.gz -g 500k -o test

dcopetti commented 4 years ago

Hi, Here they are:

(base) bash-4.2$ cd ~
(base) bash-4.2$ pwd
/home/copettid
(base) bash-4.2$ cd bin
(base) bash-4.2$ ls -lrth
total 408M
drwxr-xr-x. 2 copettid mpb 4.0K Mar 14  2019 timeout-master
-rwx------. 1 copettid mpb 408M Sep 19 09:58 Anaconda2-2019.07-MacOSX-x86_64.sh
-rwx------. 1 copettid mpb  11K Oct 24 16:11 timeout-master.zip
drwx------. 3 copettid mpb 4.0K Oct 30 10:48 mapping_pipeline
drwx------. 9 copettid mpb 4.0K Oct 30 12:06 picard
drwx------. 9 copettid mpb 4.0K Dec  4 09:56 Flye_first
drwxr-xr-x. 8 copettid mpb 4.0K Dec  5 09:49 Flye_second
drwxr-xr-x. 9 copettid mpb 4.0K Dec  6 08:17 Flye_six
drwxr-xr-x. 8 copettid mpb 4.0K Dec  6 08:43 Flye_sixb
drwxr-xr-x. 8 copettid mpb 4.0K Dec  6 08:57 Flye
(base) bash-4.2$ rm -rf Flye
(base) bash-4.2$ which flye
/home/copettid/anaconda3/bin/flye
(base) bash-4.2$ which flye-assemble
/home/copettid/anaconda3/bin/flye-assemble
(base) bash-4.2$ git clone https://github.com/fenderglass/Flye.git
Cloning into 'Flye'...
remote: Enumerating objects: 204, done.
remote: Counting objects: 100% (204/204), done.
remote: Compressing objects: 100% (131/131), done.
remote: Total 13639 (delta 135), reused 125 (delta 73), pack-reused 13435
Receiving objects: 100% (13639/13639), 17.93 MiB | 12.92 MiB/s, done.
Resolving deltas: 100% (9935/9935), done.
(base) bash-4.2$ cd Flye
(base) bash-4.2$ make
make -C /home/copettid/bin/Flye/lib/minimap2
make[1]: Entering directory '/home/copettid/bin/Flye/lib/minimap2'
/home/copettid/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  main.c -o main.o
/home/copettid/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  kthread.c -o kthread.o
/home/copettid/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  kalloc.c -o kalloc.o
/home/copettid/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  misc.c -o misc.o
/home/copettid/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc -c -g -Wall -O2 -Wc++-compat  -DHAVE_KALLOC  bseq.c -o bseq.o
bseq.c:1:10: fatal error: zlib.h: No such file or directory
 #include <zlib.h>
          ^~~~~~~~
compilation terminated.
make[1]: *** [Makefile:29: bseq.o] Error 1
make[1]: Leaving directory '/home/copettid/bin/Flye/lib/minimap2'
make: *** [Makefile:18: /home/copettid/bin/Flye/bin/flye-minimap2] Error 2
(base) bash-4.2$ bin/flye --version
2.6-release
(base) bash-4.2$ bin/flye --pacbio-raw flye/tests/data/ecoli_500kb_reads.fastq.gz -g 500k -o test
[...]
[2019-12-06 20:26:22] INFO: >>>STAGE: consensus
[2019-12-06 20:26:22] INFO: Running Minimap2
[2019-12-06 20:26:29] INFO: Computing consensus
[2019-12-06 20:26:39] INFO: Alignment error rate: 0.196733
[2019-12-06 20:26:39] INFO: >>>STAGE: repeat
[2019-12-06 20:26:39] INFO: Building and resolving repeat graph
[2019-12-06 20:26:39] INFO: Reading sequences
[2019-12-06 20:26:40] INFO: Building repeat graph
50% 100%
[2019-12-06 20:26:57] INFO: Median overlap divergence: 0
[2019-12-06 20:26:57] INFO: Aligning reads to the graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2019-12-06 20:27:21] INFO: Aligned read sequence: 7626749 / 7743943 (0.984866)
[2019-12-06 20:27:21] INFO: Median overlap divergence: 0.133219
[2019-12-06 20:27:21] INFO: Mean edge coverage: 17
[2019-12-06 20:27:21] INFO: Resolving repeats
[2019-12-06 20:27:21] INFO: >>>STAGE: trestle
Traceback (most recent call last):
  File "bin/flye", line 25, in <module>
    sys.exit(main())
  File "/home/copettid/bin/Flye/flye/main.py", line 767, in main
    _run(args)
  File "/home/copettid/bin/Flye/flye/main.py", line 566, in _run
    jobs[i].run()
  File "/home/copettid/bin/Flye/flye/main.py", line 403, in run
    repeat_graph.load_from_file(self.repeat_graph)
  File "/home/copettid/bin/Flye/flye/repeat_graph/repeat_graph.py", line 138, in load_from_file
    self_complement, resolved, mean_coverage, alt_group) = tokens[1:]
ValueError: not enough values to unpack (expected 8, got 7)

(base) bash-4.2$ ls -lrth test/
total 40K
drwxr-xr-x. 2 copettid mpb 4.0K Dec  6 20:26 00-assembly
drwxr-xr-x. 2 copettid mpb 4.0K Dec  6 20:26 10-consensus
drwxr-xr-x. 2 copettid mpb 4.0K Dec  6 20:27 20-repeat
-rw-r--r--. 1 copettid mpb  108 Dec  6 20:27 params.json
-rw-r--r--. 1 copettid mpb  20K Dec  6 20:27 flye.log
drwxr-xr-x. 2 copettid mpb 4.0K Dec  6 20:27 21-trestle
(base) bash-4.2$ ls -lrth test/20-repeat/
total 1008K
-rw-r--r--. 1 copettid mpb  367 Dec  6 20:27 graph_before_rr.gv
-rw-r--r--. 1 copettid mpb  177 Dec  6 20:27 repeat_graph_dump
-rw-r--r--. 1 copettid mpb  367 Dec  6 20:27 graph_after_rr.gv
-rw-r--r--. 1 copettid mpb 162K Dec  6 20:27 read_alignment_dump
-rw-r--r--. 1 copettid mpb 415K Dec  6 20:27 repeat_graph_edges.fasta
-rw-r--r--. 1 copettid mpb 415K Dec  6 20:27 graph_before_rr.fasta
(base) bash-4.2$ ls -lrth test/21-trestle/
total 0

Also, this is the whole log file: flye.log. The build date is still September 2019.
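
A quick way to pull the build dates out of a log is sketched below. It assumes the "Build date:" line format seen in the logs in this thread and English month abbreviations; the two sample lines are copied from earlier posts.

```python
import re
from datetime import datetime

BUILD_DATE = re.compile(
    r"DEBUG: Build date: (\w{3}\s+\d{1,2} \d{4} \d{2}:\d{2}:\d{2})"
)


def build_dates(log_lines):
    """Collect the 'Build date' stamps that the binaries write to the log."""
    dates = []
    for line in log_lines:
        match = BUILD_DATE.search(line)
        if match:
            # Collapse the padding in e.g. 'Dec  3' before parsing.
            stamp = " ".join(match.group(1).split())
            dates.append(datetime.strptime(stamp, "%b %d %Y %H:%M:%S"))
    return dates


sample = [
    "[2019-12-06 08:44:27] DEBUG: Build date: Sep 19 2019 20:22:15",
    "[2019-12-03 20:28:37] DEBUG: Build date: Dec  3 2019 20:23:12",
    "[2019-12-03 20:28:37] DEBUG: Total RAM: 3021 Gb",
]
for stamp in build_dates(sample):
    print(stamp.isoformat())
```

A build date that is older than the last make run suggests that the log was produced by previously installed binaries.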

What I am actually puzzled about is that I was once able to complete an assembly with the >8 kb dataset, but since I started using the >5 kb set it dies. It also dies with small genomes, like this E. coli one. The error you see here is exactly the same one I get with my data. Thanks for the help,

Dario

mikolmogorov commented 4 years ago

As you can see, there is a compilation error, and the updated binaries were not compiled. You should ensure that the make command finishes without errors before running Flye. Otherwise, your old binaries from the bioconda installation are used. I suggest disabling the environment or deleting the flye bioconda package for now.
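
To see which binaries a shell lookup would resolve, something like the sketch below can help. It assumes the helper binaries are found via PATH (how exactly Flye locates them is not shown in this thread); the binary names are the ones that appear in the logs above, and the source directory is an example path.

```python
import os
import shutil


def resolve_binaries(names, source_root):
    """Map each name to (resolved path, whether that path lies inside source_root)."""
    root = os.path.realpath(source_root)
    report = {}
    for name in names:
        path = shutil.which(name)  # same lookup as `which NAME`
        inside = False
        if path is not None:
            real = os.path.realpath(path)
            inside = os.path.commonpath([root, real]) == root
        report[name] = (path, inside)
    return report


binaries = ["flye-assemble", "flye-repeat", "flye-contigger", "flye-minimap2"]
source_dir = os.path.expanduser("~/Flye")  # example location of the git clone
for name, (path, inside) in resolve_binaries(binaries, source_dir).items():
    print(name, path, "in source tree" if inside else "NOT in source tree")
```

Any binary that resolves to a conda directory instead of the freshly built source tree is a candidate for the stale build described above.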

cabbagesofdoom commented 4 years ago

Hi. I've been having the same problem. Any chance of making an intermediate release of the current flye branch? The SysAdmins for our HPC have a policy of not installing anything that isn't a "release". Thanks.

mikolmogorov commented 4 years ago

@cabbagesofdoom Could you also describe your dataset and post the log file?

You don't need admin rights to build and run Flye from source - just follow the instructions in the INSTALL.md. Let me know if you have any issues. Before making a release, I first need to make sure that the problem is indeed fixed.

cabbagesofdoom commented 4 years ago

Thanks @fenderglass. The log file is 134Mb! Do you want the whole thing? (Perhaps I can share directly with you through FileSender?) The last few lines are below, in case this is enough.

The dataset itself is a mixture of PacBio and ONT data for the cane toad. In total it's about 125 Gb data (~81 Gb PacBio and ~44 Gb ONT) and I think the genome size is probably around 3.5 Gb. It's quite a repeat-rich genome and doesn't tend to assemble well.

These are the last few lines of the log:

[2019-12-05 12:05:13] DEBUG: Starting 282198 aln:24 minSpan:0
[2019-12-05 12:05:13] DEBUG:          -36941    num:8   flank:5279      span:52
[2019-12-05 12:05:13] DEBUG:          238694    num:12  flank:6558      span:28
[2019-12-05 12:05:13] DEBUG: Mult: -282198 -> 238695 -> 310477 -> -250115 -> 170400 -> -312511 -> 167834 -> 319954 -> -169019 -> 250484 -> -307636 -> 76050 -> 288727 -> -240518 -> -330706 -> -237102 -> -307742 -> -137082 -> 156364 -> 107753 -> -149695 -> -240026 -> -75353 -> 235434 -> -320692 -> 282741 -> -308548 -> -238697 -> -324403 -> 248402 -> -90620 -> 299667 -> 162714 -> 328984 -> -279367 -> -289266 -> 282751 -> -325462 -> 281671 -> 319246 -> -266830 -> -281323 -> 90231        2901155 24       (1,0)
[2019-12-05 12:05:30] DEBUG: Removed 6 simple and 8 double chimeric junctions
[2019-12-05 12:06:10] ERROR: Segmentation fault! Backtrace:
[2019-12-05 12:06:10] ERROR:    flye-repeat(_Z15segfaultHandleri+0x1e) [0x4c067e]
[2019-12-05 12:06:10] ERROR:    /lib64/libc.so.6(+0x363b0) [0x2b6ecbe313b0]
[2019-12-05 12:06:10] ERROR:    flye-repeat(_Z9vecRemoveIP9GraphEdgeEvRSt6vectorIT_SaIS3_EES3_+0x27) [0x4423c7]
[2019-12-05 12:06:10] ERROR:    flye-repeat(_ZN19MultiplicityInferer22removeUnsupportedEdgesEv+0x194) [0x439ac4]
[2019-12-05 12:06:10] ERROR:    flye-repeat(main+0xa65) [0x4363a5]
[2019-12-05 12:06:10] ERROR:    /lib64/libc.so.6(__libc_start_main+0xf5) [0x2b6ecbe1d505]
[2019-12-05 12:06:10] ERROR:    flye-repeat() [0x436b9f]
[2019-12-05 12:06:17] root: ERROR: Command '['flye-repeat', '--disjointigs', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/10-consensus/consensus.fasta', '--reads', '/srv/scratch/canetoad/CaneToad-May15/raw/pacbio/fasta/canetoad.6pblib.subreads.fasta,/srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.pass.fasta,/srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.fail.fasta', '--out-dir', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/20-repeat', '--config', '/apps/flye/2.6/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg', '--log', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/flye.log', '--threads', '24', '--min-ovlp', '4000', '--kmer', '17']' died with <Signals.SIGABRT: 6>.

mikolmogorov commented 4 years ago

Thanks! I would suggest building the latest github version from source, as described here: https://github.com/fenderglass/Flye/blob/flye/docs/INSTALL.md#local-building-without-installation

It should not require admin rights. Feel free to use --resume-from repeat to save on assembly time.

Psy-Fer commented 4 years ago

So, an update, getting a little closer.

[2019-12-03 20:40:06] INFO: Starting Flye 2.6-release
[2019-12-03 20:40:07] INFO: Resuming previous run
[2019-12-03 20:40:07] INFO: >>>STAGE: repeat
[2019-12-03 20:40:07] INFO: Building and resolving repeat graph
[2019-12-03 20:40:07] INFO: Reading sequences
[2019-12-03 20:49:29] INFO: Building repeat graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2019-12-04 20:49:21] INFO: Median overlap divergence: 0.0695221
[2019-12-12 22:51:23] INFO: Aligning reads to the graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2019-12-13 10:27:45] INFO: Aligned read sequence: 75879551270 / 114451191639 (0.662986)
[2019-12-13 10:27:45] INFO: Median overlap divergence: 0.0960807
[2019-12-13 10:29:07] INFO: Mean edge coverage: 52
[2019-12-13 10:29:21] INFO: Simplifying the graph

That only took 8 days. hahaha.

And it got past the "Writing Dot" step

[2019-12-13 10:29:17] DEBUG: Unique coverage threshold 80
[2019-12-13 10:29:18] DEBUG: Writing Dot
[2019-12-13 10:29:21] INFO: Simplifying the graph
[2019-12-13 10:29:26] DEBUG: Read coverage cutoff: 10
[2019-12-13 10:29:29] DEBUG: [SIMPL] Removed 5052 paths with low coverage

Hopefully it can get to the end this time. I'll be really excited if this works.

I'll let you know when it's done.

Psy-Fer commented 4 years ago

Hey @cabbagesofdoom Looks like your run ended with an abort signal 6, triggered by a seg fault. Which GCC are you using?
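
For reference, the signal can be read straight from the final "died with" line of the Python wrapper. A small sketch, assuming the message format seen in the log above (the sample line is shortened). The log shows both a "Segmentation fault!" backtrace and SIGABRT, which would be consistent with a handler that prints the backtrace and then aborts, though that is an inference and not something the log states.

```python
import re
import signal

DIED_WITH = re.compile(r"died with <Signals\.(\w+): (\d+)>")


def exit_signal(log_line):
    """Return (name, number) of the signal reported in a 'died with' line, or None."""
    match = DIED_WITH.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


line = "root: ERROR: Command '[...]' died with <Signals.SIGABRT: 6>."
name, number = exit_signal(line)
# Compare against the constant from the standard signal module.
print(name, number, number == signal.SIGABRT)
```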

cabbagesofdoom commented 4 years ago

@Psy-Fer I'm not actually sure, as I just load the module installed by the sys admin. The latest one on the system is gcc/7.3.0.

mikolmogorov commented 4 years ago

@Psy-Fer any updates?

Psy-Fer commented 4 years ago

Hey,

Xmas was pretty lame for cluster uptime. So although it seemed to get past that stage, I never got to see it finish the repeat step before it was killed. Restarted a week ago, so will probably have to wait another few days to see if it gets past that step again, and if it gets all the way to the end of the repeat step. It has to start the repeat step from the start every time, and it takes a loooooong time.

I'll update when I have news.

Thanks

cabbagesofdoom commented 4 years ago

The sys admin here installed the latest github version before Christmas (despite saying they wouldn't!) and I was able to set it going again from the repeat stage, but it died again at the contigger step:

[2019-12-23 09:19:34] INFO: Starting Flye 2.6-release
[2019-12-23 09:19:34] INFO: Resuming previous run
[2019-12-23 09:19:34] INFO: >>>STAGE: repeat
[2019-12-23 09:19:34] INFO: Building and resolving repeat graph
[2019-12-23 09:19:34] INFO: Reading sequences
...
[2019-12-24 07:59:46] DEBUG: Building positional index
[2019-12-24 07:59:52] DEBUG: Total sequence: 125113900766 bp
[2019-12-24 08:00:00] WARNING: Edge 196386 not paired
[2019-12-24 08:00:00] WARNING: Edge 186592 not paired
[2019-12-24 08:00:00] WARNING: Edge 196386 not paired
[2019-12-24 08:00:00] WARNING: Edge 186592 not paired
[2019-12-24 08:00:00] WARNING: Edge 29303 brakes symmetry
[2019-12-24 08:00:00] WARNING: Edge 140494 brakes symmetry
[2019-12-24 08:00:00] ERROR: Caught unhandled exception: _Map_base::at
[2019-12-24 08:00:00] ERROR:    flye-contigger(_Z16exceptionHandlerv+0xcd) [0x4e3bad]
[2019-12-24 08:00:00] ERROR:    /apps/gcc/4.9.4/lib64/libstdc++.so.6(+0x5db86) [0x2b1633a03b86]
[2019-12-24 08:00:00] ERROR:    /apps/gcc/4.9.4/lib64/libstdc++.so.6(+0x5dbd1) [0x2b1633a03bd1]
[2019-12-24 08:00:00] ERROR:    /apps/gcc/4.9.4/lib64/libstdc++.so.6(+0x5dde9) [0x2b1633a03de9]
[2019-12-24 08:00:00] ERROR:    /apps/gcc/4.9.4/lib64/libstdc++.so.6(_ZSt20__throw_out_of_rangePKc+0x66) [0x2b1633a59dd6]
[2019-12-24 08:00:00] ERROR:    flye-contigger(_ZNK11RepeatGraph14complementEdgeEP9GraphEdge+0xd8) [0x48cb48]
[2019-12-24 08:00:00] ERROR:    flye-contigger(_ZN11RepeatGraph13validateGraphEv+0x8e6) [0x48da56]
[2019-12-24 08:00:00] ERROR:    flye-contigger(main+0x8c1) [0x43aa21]
[2019-12-24 08:00:00] ERROR:    /lib64/libc.so.6(__libc_start_main+0xf5) [0x2b1634407505]
[2019-12-24 08:00:00] ERROR:    flye-contigger() [0x43b51f]
[2019-12-24 08:00:02] root: ERROR: Command '['flye-contigger', '--graph-edges', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/21-trestle/repeat_graph_edges.fasta', '--reads', '/srv/scratch/canetoad/CaneToad-May15/raw/pacbio/fasta/canetoad.6pblib.subreads.fasta,/srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.pass.fasta,/srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.fail.fasta', '--out-dir', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/30-contigger', '--config', '/apps/flye/20191215/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg', '--repeat-graph', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/21-trestle/repeat_graph_dump', '--graph-aln', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/20-repeat/read_alignment_dump', '--log', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/flye.log', '--threads', '24', '--min-ovlp', '4000', '--kmer', '17']' died with <Signals.SIGABRT: 6>.

@Psy-Fer, I'm assuming from the errors that it's using GCC v4.9.4, but I'm afraid I'm ignorant as to the importance of this observation.
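
The library paths in the backtrace can be extracted as in the sketch below, which assumes the frame format shown in the log above. Note that this only tells you which libstdc++ was loaded at run time; that is not necessarily the compiler the binaries were built with.

```python
import re

# A backtrace frame looks like: "ERROR:    <module>(<symbol>+<offset>) [<address>]"
FRAME = re.compile(r"ERROR:\s+(\S+?)\(")


def libraries_in_backtrace(log_lines):
    """Return the distinct module paths named in backtrace frames, in first-seen order."""
    seen = []
    for line in log_lines:
        match = FRAME.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample = [
    "[2019-12-24 08:00:00] ERROR: Caught unhandled exception: _Map_base::at",
    "[2019-12-24 08:00:00] ERROR:    flye-contigger(_Z16exceptionHandlerv+0xcd) [0x4e3bad]",
    "[2019-12-24 08:00:00] ERROR:    /apps/gcc/4.9.4/lib64/libstdc++.so.6(+0x5db86) [0x2b1633a03b86]",
    "[2019-12-24 08:00:00] ERROR:    /lib64/libc.so.6(__libc_start_main+0xf5) [0x2b1634407505]",
    "[2019-12-24 08:00:00] ERROR:    flye-contigger() [0x43b51f]",
]
for module in libraries_in_backtrace(sample):
    print(module)
```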

I now have a 21-trestle directory and an empty 30-contigger directory. Is it worth trying again with --resume to see if it dies in the same way but with a new log file?

mikolmogorov commented 4 years ago

@cabbagesofdoom thanks! Well, looks like the problem is still there :(

Could you please send me the full log again? This time it should contain more info on a possible cause, so hopefully I can figure out what is happening.

cabbagesofdoom commented 4 years ago

@fenderglass, I re-ran:

flye --pacbio-raw $PBREADS $ONTPASS $ONTFAIL --out-dir flye_all --genome-size 3.5g --threads 24 --resume

This is the full log:

[2020-01-12 20:27:23] root: INFO: Starting Flye 2.6-release
[2020-01-12 20:27:23] root: DEBUG: Cmd: /apps/flye/20191215/bin/flye --pacbio-raw /srv/scratch/canetoad/CaneToad-May15/raw/pacbio/fasta/canetoad.6pblib.subreads.fasta /srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.pass.fasta /srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.fail.fasta --out-dir flye_all --genome-size 3.5g --threads 24 --resume
[2020-01-12 20:27:23] root: DEBUG: Python version: 3.7.4 (default, Oct  4 2019, 15:02:56) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
[2020-01-12 20:27:23] root: INFO: Resuming previous run
[2020-01-12 20:27:23] root: INFO: >>>STAGE: contigger
[2020-01-12 20:27:23] root: INFO: Generating contigs
[2020-01-12 20:27:23] root: DEBUG: -----Begin contigger analyser log------
[2020-01-12 20:27:23] root: DEBUG: Running: flye-contigger --graph-edges /srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/21-trestle/repeat_graph_edges.fasta --reads /srv/scratch/canetoad/CaneToad-May15/raw/pacbio/fasta/canetoad.6pblib.subreads.fasta,/srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.pass.fasta,/srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.fail.fasta --out-dir /srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/30-contigger --config /apps/flye/20191215/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --repeat-graph /srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/21-trestle/repeat_graph_dump --graph-aln /srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/20-repeat/read_alignment_dump --log /srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/flye.log --threads 24 --min-ovlp 4000 --kmer 17
[2020-01-12 20:27:24] DEBUG: Build date: Dec 16 2019 11:52:50
[2020-01-12 20:27:24] DEBUG: Total RAM: 1007 Gb
[2020-01-12 20:27:24] DEBUG: Available RAM: 992 Gb
[2020-01-12 20:27:24] DEBUG: Total CPUs: 28
[2020-01-12 20:27:24] DEBUG: Parameters:
[2020-01-12 20:27:24] DEBUG:    big_genome_threshold=29000000
[2020-01-12 20:27:24] DEBUG:    low_cutoff_warning=1
[2020-01-12 20:27:24] DEBUG:    hard_min_coverage_rate=10
[2020-01-12 20:27:24] DEBUG:    assemble_kmer_sample=1
[2020-01-12 20:27:24] DEBUG:    repeat_graph_kmer_sample=1
[2020-01-12 20:27:24] DEBUG:    read_align_kmer_sample=1
[2020-01-12 20:27:24] DEBUG:    meta_read_top_kmer_rate=0.25
[2020-01-12 20:27:24] DEBUG:    meta_read_filter_kmer_freq=10
[2020-01-12 20:27:24] DEBUG:    maximum_jump=1500
[2020-01-12 20:27:24] DEBUG:    maximum_overhang=1500
[2020-01-12 20:27:24] DEBUG:    repeat_kmer_rate=100
[2020-01-12 20:27:24] DEBUG:    assemble_ovlp_relative_divergence=0.10
[2020-01-12 20:27:24] DEBUG:    repeat_graph_ovlp_divergence=0.15
[2020-01-12 20:27:24] DEBUG:    read_align_ovlp_divergence=0.25
[2020-01-12 20:27:24] DEBUG:    max_coverage_drop_rate=5
[2020-01-12 20:27:24] DEBUG:    chimera_window=100
[2020-01-12 20:27:24] DEBUG:    min_reads_in_disjointig=4
[2020-01-12 20:27:24] DEBUG:    max_inner_reads=10
[2020-01-12 20:27:24] DEBUG:    max_inner_fraction=0.25
[2020-01-12 20:27:24] DEBUG:    add_unassembled_reads=0
[2020-01-12 20:27:24] DEBUG:    max_separation=500
[2020-01-12 20:27:24] DEBUG:    unique_edge_length=50000
[2020-01-12 20:27:24] DEBUG:    min_repeat_res_support=0.51
[2020-01-12 20:27:24] DEBUG:    out_paths_ratio=5
[2020-01-12 20:27:24] DEBUG:    graph_cov_drop_rate=5
[2020-01-12 20:27:24] DEBUG:    coverage_estimate_window=100
[2020-01-12 20:27:24] DEBUG:    extend_contigs_with_repeats=1
[2020-01-12 20:27:24] DEBUG:    min_read_cov_cutoff=3
[2020-01-12 20:27:24] DEBUG:    short_tip_length=20000
[2020-01-12 20:27:24] DEBUG:    long_tip_length=100000
[2020-01-12 20:27:24] DEBUG:    max_bubble_length=50000
[2020-01-12 20:27:24] DEBUG: Running with k-mer size: 17
[2020-01-12 20:27:24] DEBUG: Selected minimum overlap 4000
[2020-01-12 20:27:24] INFO: Reading sequences
[2020-01-12 20:44:50] DEBUG: Building positional index
[2020-01-12 20:44:56] DEBUG: Total sequence: 125113900766 bp
[2020-01-12 20:45:03] WARNING: Edge 196386 not paired
[2020-01-12 20:45:03] WARNING: Edge 186592 not paired
[2020-01-12 20:45:03] WARNING: Edge 196386 not paired
[2020-01-12 20:45:03] WARNING: Edge 186592 not paired
[2020-01-12 20:45:03] WARNING: Edge 29303 brakes symmetry
[2020-01-12 20:45:03] WARNING: Edge 140494 brakes symmetry
[2020-01-12 20:45:03] ERROR: Caught unhandled exception: _Map_base::at
[2020-01-12 20:45:03] ERROR:    flye-contigger(_Z16exceptionHandlerv+0xcd) [0x4e3bad]
[2020-01-12 20:45:03] ERROR:    /apps/gcc/4.9.4/lib64/libstdc++.so.6(+0x5db86) [0x2ba59ddf8b86]
[2020-01-12 20:45:03] ERROR:    /apps/gcc/4.9.4/lib64/libstdc++.so.6(+0x5dbd1) [0x2ba59ddf8bd1]
[2020-01-12 20:45:03] ERROR:    /apps/gcc/4.9.4/lib64/libstdc++.so.6(+0x5dde9) [0x2ba59ddf8de9]
[2020-01-12 20:45:03] ERROR:    /apps/gcc/4.9.4/lib64/libstdc++.so.6(_ZSt20__throw_out_of_rangePKc+0x66) [0x2ba59de4edd6]
[2020-01-12 20:45:03] ERROR:    flye-contigger(_ZNK11RepeatGraph14complementEdgeEP9GraphEdge+0xd8) [0x48cb48]
[2020-01-12 20:45:03] ERROR:    flye-contigger(_ZN11RepeatGraph13validateGraphEv+0x8e6) [0x48da56]
[2020-01-12 20:45:03] ERROR:    flye-contigger(main+0x8c1) [0x43aa21]
[2020-01-12 20:45:03] ERROR:    /lib64/libc.so.6(__libc_start_main+0xf5) [0x2ba59e7fc505]
[2020-01-12 20:45:03] ERROR:    flye-contigger() [0x43b51f]
[2020-01-12 20:45:06] root: ERROR: Command '['flye-contigger', '--graph-edges', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/21-trestle/repeat_graph_edges.fasta', '--reads', '/srv/scratch/canetoad/CaneToad-May15/raw/pacbio/fasta/canetoad.6pblib.subreads.fasta,/srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.pass.fasta,/srv/scratch/canetoad/CaneToad-May15/data/2019-11-15.ONT/canetoad.ont.2019-11-15.500bp.fail.fasta', '--out-dir', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/30-contigger', '--config', '/apps/flye/20191215/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg', '--repeat-graph', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/21-trestle/repeat_graph_dump', '--graph-aln', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/20-repeat/read_alignment_dump', '--log', '/srv/scratch/canetoad/CaneToad-May15/assemblies/2019-11-28.Flye2.6/flye_all/flye.log', '--threads', '24', '--min-ovlp', '4000', '--kmer', '17']' died with <Signals.SIGABRT: 6>.

mikolmogorov commented 4 years ago

@cabbagesofdoom

Please run using --resume-from repeat, not just --resume (and send the full log). Sorry for the confusion! Also, I have just pushed an update to the flye branch which might fix the crash, so I recommend pulling and compiling before making another run.

dcopetti commented 4 years ago

hi @fenderglass : here is the log file of the run dying after the contig stage, but before writing the graph: https://de.cyverse.org/dl/d/C6D34FFE-9646-4121-BF00-2015A3EA6EA2/rab_2kb_flye.log.gz Thanks, Dario

dcopetti commented 4 years ago

@fenderglass : before checking if the bug is fixed, I am still facing the problem with the installation. When running make (on a virtual machine), the sys admin gets the following (sorry for the German, but it still seems understandable):

/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44:   erfordert durch »void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::vector<EdgeAlignment>*, std::vector<std::vector<EdgeAlignment> > >; _Compare = HaplotypeResolver::findVariantSegment(GraphEdge*, const std::vector<std::vector<EdgeAlignment> >&, const std::unordered_set<GraphEdge*>&)::__lambda11]«
repeat_graph/haplotype_resolver.cpp:255:62:   von hier erfordert
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: Fehler: keine Übereinstimmung für Aufruf von »(HaplotypeResolver::findVariantSegment(GraphEdge*, const std::vector<std::vector<EdgeAlignment> >&, const std::unordered_set<GraphEdge*>&)::__lambda11) (std::vector<EdgeAlignment>&, const std::vector<EdgeAlignment>&)«
    while (__comp(*__first, __pivot))
                                   ^
repeat_graph/haplotype_resolver.cpp:253:7: Anmerkung: Kandidaten sind:
      [](GraphAlignment& a1, GraphAlignment& a2)
       ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from repeat_graph/../sequence/sequence.h:11,
                 from repeat_graph/../sequence/sequence_container.h:12,
                 from repeat_graph/repeat_graph.h:9,
                 from repeat_graph/haplotype_resolver.h:7,
                 from repeat_graph/haplotype_resolver.cpp:1:
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: Anmerkung: bool (*)(GraphAlignment&, GraphAlignment&) {aka bool (*)(std::vector<EdgeAlignment>&, std::vector<EdgeAlignment>&)} <Umformung>
    while (__comp(*__first, __pivot))
                                   ^
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: Anmerkung:   Kandidat erwartet 3 Argumente, 3 angegeben
repeat_graph/haplotype_resolver.cpp:253:47: Anmerkung: HaplotypeResolver::findVariantSegment(GraphEdge*, const std::vector<std::vector<EdgeAlignment> >&, const std::unordered_set<GraphEdge*>&)::__lambda11
      [](GraphAlignment& a1, GraphAlignment& a2)
                                               ^
repeat_graph/haplotype_resolver.cpp:253:47: Anmerkung:   keine bekannte Umwandlung für Argument 2 von »const std::vector<EdgeAlignment>« nach »GraphAlignment& {aka std::vector<EdgeAlignment>&}«
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from repeat_graph/../sequence/sequence.h:11,
                 from repeat_graph/../sequence/sequence_container.h:12,
                 from repeat_graph/repeat_graph.h:9,
                 from repeat_graph/haplotype_resolver.h:7,
                 from repeat_graph/haplotype_resolver.cpp:1:
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: Fehler: keine Übereinstimmung für Aufruf von »(HaplotypeResolver::findVariantSegment(GraphEdge*, const std::vector<std::vector<EdgeAlignment> >&, const std::unordered_set<GraphEdge*>&)::__lambda11) (const std::vector<EdgeAlignment>&, std::vector<EdgeAlignment>&)«
    while (__comp(__pivot, *__last))

...
make[1]: *** [repeat_graph/haplotype_resolver.o] Fehler 1
make[1]: Leaving directory `/var/scheff/Flye/src'
make: *** [all] Fehler 2

It may be minor, but I see this discrepancy: the paths in the error point to the C++ headers for version 4.8.2, while if I check the compiler version I get

(base) bash-4.2$ gcc --version 
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39) 
Copyright (C) 2015 Free Software Foundation, Inc. 
This is free software; see the source for copying conditions.  There is NO 
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 

(base) bash-4.2$ c++ --version 
c++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39) 
Copyright (C) 2015 Free Software Foundation, Inc. 
This is free software; see the source for copying conditions.  There is NO 
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 

(base) bash-4.2$ which gcc 
/usr/bin/gcc 
(base) bash-4.2$ which c++ 
/usr/bin/c++ 

Do you think the version will matter? If not, where should we look for the cause? Of note: the zlib error does not come up with his user (or with this software update). Thanks

mikolmogorov commented 4 years ago

@dcopetti Thanks for sending the log file. It says that the binaries were built in September, so it cannot be the latest GitHub code. You would need to grab the latest code from the flye branch and compile it.

This brings us to your second message :) I can't read German, but it looks like a compilation error (it is not minor: if there was any error, it means Flye did not compile). It most likely has something to do with your environment. The GCC version is very old (released 6 years ago), which is likely the cause. You would need a more recent GCC with zlib developer headers.

dcopetti commented 4 years ago

@fenderglass : the newest version of Flye (2.7b-b1526) was installed just a few hours ago; I will try it as soon as other jobs on the machine are completed. GCC is still old:

-bash-4.2$ gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

-bash-4.2$ /usr/local/Flye/bin/flye
usage: flye (--pacbio-raw | --pacbio-corr | --nano-raw |
             --nano-corr | --subassemblies) file1 [file_2 ...]
             --genome-size SIZE --out-dir PATH

             [--threads int] [--iterations int] [--min-overlap int]
             [--meta] [--plasmids] [--no-trestle] [--polish-target]
             [--keep-haplotypes] [--debug] [--version] [--help]
             [--resume] [--resume-from] [--stop-after]
flye: error: the following arguments are required: -o/--out-dir

I'll let you know if the run completes or not.

I started (and aborted soon after) a run with a small fastq input, the log says:

[2020-01-22 15:19:43] root: INFO: Starting Flye 2.7b-b1526
[2020-01-22 15:19:43] root: DEBUG: Cmd: /usr/local/Flye/bin/flye --nano-raw test.fq --genome-size 5g --out-dir testdir -t 4
[2020-01-22 15:19:43] root: DEBUG: Python version: 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0]
[2020-01-22 15:19:43] root: INFO: >>>STAGE: configure
[2020-01-22 15:19:43] root: INFO: Configuring run
[2020-01-22 15:19:50] root: INFO: Total read length: 1085688571
[2020-01-22 15:19:50] root: INFO: Input genome size: 5000000000
[2020-01-22 15:19:50] root: INFO: Estimated coverage: 0
[2020-01-22 15:19:50] root: WARNING: Expected read coverage is 0, the assembly is not guaranteed to be optimal in this setting. Are you sure that the genome size was entered correctly?
[2020-01-22 15:19:50] root: INFO: Reads N50/N90: 42773 / 11375
[2020-01-22 15:19:50] root: INFO: Minimum overlap set to 5000
[2020-01-22 15:19:50] root: INFO: Selected k-mer size: 17
[2020-01-22 15:19:50] root: INFO: >>>STAGE: assembly
[2020-01-22 15:19:50] root: INFO: Assembling disjointigs
[2020-01-22 15:19:50] root: DEBUG: -----Begin assembly log------
[2020-01-22 15:19:50] root: DEBUG: Running: flye-assemble --reads test.fq --out-asm /scratch/dario/flye/testdir/00-assembly/draft_assembly.fasta --genome-size 5000000000 --config /usr/local/Flye/flye/config/bin_cfg/asm_raw_reads.cfg --log /scratch/dario/flye/testdir/flye.log --threads 4 --min-ovlp 5000 --kmer 17
[2020-01-22 15:19:50] DEBUG: Build date: Jan 22 2020 14:07:12
[2020-01-22 15:19:50] DEBUG: Total RAM: 960 Gb
[2020-01-22 15:19:50] DEBUG: Available RAM: 252 Gb

(Note: here it says GCC 7.3.) Is there a way to know if the run will complete, before waiting for about a week to get to the end of the 3- step? That way I can open a ticket again to fix it. I'll keep you posted.
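As a side note on the "Estimated coverage: 0" line and the warning in the log above: the numbers are consistent with coverage being total read length divided by the input genome size, reported as a whole number. The truncation to an integer is an assumption inferred from the log, not confirmed from Flye's code, and `estimated_coverage` is a hypothetical helper name. A minimal sketch:

```python
def estimated_coverage(total_read_length: int, genome_size: int) -> int:
    """Whole-number coverage: total bases divided by genome size, truncated.

    Assumption: truncation is inferred from the log output, not from Flye's source.
    """
    return total_read_length // genome_size


total_read_length = 1_085_688_571  # "Total read length" from the log
genome_size = 5_000_000_000        # "Input genome size" (--genome-size 5g)

# Fractional coverage, then the whole-number value a log line would show.
print(total_read_length / genome_size)
print(estimated_coverage(total_read_length, genome_size))
```

With only a small test fastq against a 5g genome, the fractional coverage is roughly 0.22x, so a whole-number estimate of 0 is expected for this test run.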

mikolmogorov commented 4 years ago

@dcopetti looks like the right version now. Please use --resume-from repeat. Unfortunately, there is no way to bypass this step since this is where the error initially was. Let me know how it goes!

GCC 7.3 in the log refers to the compiler used to build Python, which could be different from the one used for the Flye binaries. You might also have multiple GCC environments on the machine.

Psy-Fer commented 4 years ago

I've been running this sample from the repeat step, since the 6th of Jan (now the 23rd), and it's still chugging along. However the step it is on seems to only be using a single core. I think this might be why it's taking forever.

stderr

[2020-01-06 10:26:05] INFO: Resuming previous run
[2020-01-06 10:26:05] INFO: >>>STAGE: repeat
[2020-01-06 10:26:05] INFO: Building and resolving repeat graph
[2020-01-06 10:26:05] INFO: Reading sequences
[2020-01-06 10:38:19] INFO: Building repeat graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2020-01-07 12:55:34] INFO: Median overlap divergence: 0.0695214
[2020-01-16 04:43:44] INFO: Aligning reads to the graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2020-01-16 15:59:55] INFO: Aligned read sequence: 75876783523 / 114451191639 (0.662962)
[2020-01-16 15:59:56] INFO: Median overlap divergence: 0.0961182
[2020-01-16 16:01:03] INFO: Mean edge coverage: 52
[2020-01-16 16:01:17] INFO: Simplifying the graph

flye.log

[2020-01-16 16:01:13] DEBUG: 451284 len:1855    cov:34  mult:0.653846
[2020-01-16 16:01:13] DEBUG: 540486 len:1852    cov:34  mult:0.653846
[2020-01-16 16:01:13] DEBUG: -451284    len:1855    cov:34  mult:0.653846
[2020-01-16 16:01:13] DEBUG: 543628 len:2084    cov:38  mult:0.730769
[2020-01-16 16:01:13] DEBUG: -451285    len:1835    cov:0   mult:0
[2020-01-16 16:01:13] DEBUG: 451286 len:2204    cov:15  mult:0.288462
[2020-01-16 16:01:13] DEBUG: -451286    len:2204    cov:15  mult:0.288462
[2020-01-16 16:01:13] DEBUG: -316978    len:2768    cov:12  mult:0.230769
[2020-01-16 16:01:13] DEBUG: 451287 len:2316    cov:7   mult:0.134615
[2020-01-16 16:01:13] DEBUG: -192030    len:511 cov:7   mult:0.134615
[2020-01-16 16:01:13] DEBUG: 451291 len:955 cov:39  mult:0.75
[2020-01-16 16:01:13] DEBUG: 192031 len:11419   cov:24  mult:0.461538
[2020-01-16 16:01:13] DEBUG: -451291    len:955 cov:39  mult:0.75
[2020-01-16 16:01:13] DEBUG: 451293 len:514 cov:0   mult:0
[2020-01-16 16:01:13] DEBUG: 451294 len:1916    cov:55  mult:1.05769
[2020-01-16 16:01:13] DEBUG: Unique coverage threshold 80
[2020-01-16 16:01:14] DEBUG: Writing Dot
[2020-01-16 16:01:17] INFO: Simplifying the graph
[2020-01-16 16:01:20] DEBUG: Read coverage cutoff: 10
[2020-01-16 16:01:24] DEBUG: [SIMPL] Removed 5017 paths with low coverage

So it's been on this stage since the 16th, so 7 days. Should I cross my fingers and toes and hope it gets to the end? lol

James
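For anyone reading the flye.log excerpt above: the `mult` values appear to equal each edge's `cov` divided by the "Mean edge coverage: 52" reported in stderr. That relationship is inferred from the numbers shown here and is an assumption, not something confirmed from Flye's documentation; `multiplicity` is a hypothetical helper name. A small check against the logged rows:

```python
MEAN_EDGE_COVERAGE = 52  # "Mean edge coverage" from the stderr excerpt

# (cov, mult as printed in flye.log) pairs copied from the excerpt.
ROWS = [
    (34, "0.653846"),
    (38, "0.730769"),
    (0, "0"),
    (15, "0.288462"),
    (12, "0.230769"),
    (7, "0.134615"),
    (39, "0.75"),
    (24, "0.461538"),
    (55, "1.05769"),
]


def multiplicity(cov: int, mean_cov: int) -> float:
    """Assumed relationship: edge coverage relative to the mean edge coverage."""
    return cov / mean_cov


for cov, logged in ROWS:
    # Format with 6 significant digits to match the precision seen in the log.
    computed = f"{multiplicity(cov, MEAN_EDGE_COVERAGE):.6g}"
    print(cov, logged, computed, computed == logged)
```

If the assumption holds, a `mult` near 1 marks an edge covered about as deeply as the average edge, and values well below 1 mark low-coverage edges.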

mikolmogorov commented 4 years ago

@Psy-Fer I see, it is definitely taking way longer than expected for this stage. For a typical run the pipeline goes from "Median overlap divergence" to "Aligning reads to the graph" in no longer than a few minutes (usually seconds), but it took 9 days for you.

I guess let's try to wait at this point. Hopefully the crash is fixed now, but optimizing those bottlenecks for very complex assemblies is gonna be another question :)
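One way to spot such bottlenecks is to scan a log for the largest gaps between consecutive timestamped lines. The sketch below assumes the `[YYYY-MM-DD HH:MM:SS]` prefix seen in the logs in this thread; `largest_gaps` is a hypothetical helper, not part of Flye, and the sample lines are copied from the stderr excerpt above.

```python
import re
from datetime import datetime

# Matches the "[2020-01-16 04:43:44]" prefix used in the logs in this thread.
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def largest_gaps(lines, top=3):
    """Return up to `top` (gap, line_before, line_after) tuples, largest gap first."""
    stamped = []
    for line in lines:
        match = STAMP.match(line)
        if match:  # skip progress bars and other lines without a timestamp
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            stamped.append((when, line.rstrip()))
    gaps = [(b[0] - a[0], a[1], b[1]) for a, b in zip(stamped, stamped[1:])]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


SAMPLE = [
    "[2020-01-06 10:38:19] INFO: Building repeat graph",
    "0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% ",
    "[2020-01-07 12:55:34] INFO: Median overlap divergence: 0.0695214",
    "[2020-01-16 04:43:44] INFO: Aligning reads to the graph",
    "[2020-01-16 15:59:55] INFO: Aligned read sequence: 75876783523 / 114451191639 (0.662962)",
]

gaps = largest_gaps(SAMPLE)
for gap, before, after in gaps:
    print(gap, "|", before, "->", after)
```

To run it on a real log, pass the open file object instead of `SAMPLE`, for example `largest_gaps(open("flye.log"))`.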

Psy-Fer commented 4 years ago

Yea, optimisation can be a real pain with this stuff.

I think I spoke a day too soon, it's now doing things.

[2020-01-16 16:01:24] DEBUG: [SIMPL] Removed 5017 paths with low coverage
[2020-01-24 01:24:51] DEBUG:    Connection -105580  -96623  13
[2020-01-24 01:24:51] DEBUG:    Connection -105581  23834   7
[2020-01-24 01:24:57] DEBUG:    Connection -6090    -6088   6
[2020-01-24 01:27:09] DEBUG:    Connection -128539  -92901  3
[2020-01-24 01:27:17] DEBUG:    Connection 507191   507193  6
[2020-01-24 01:27:54] DEBUG:    Connection -42181   62494   10
[2020-01-24 01:27:54] DEBUG:    Connection -516961  62495   19
[2020-01-24 01:28:24] DEBUG:    Connection 91077    96269   7
[2020-01-24 01:28:24] DEBUG:    Connection 91078    96270   6
[2020-01-24 01:28:46] DEBUG:    Connection 80567    104217  4
[2020-01-24 01:28:46] DEBUG:    Connection 80568    -535554 4
[2020-01-24 01:30:07] DEBUG:    Connection 11807    -76667  4
[2020-01-24 01:30:29] DEBUG:    Connection 110944   507199  2
[2020-01-24 01:30:29] DEBUG:    Connection 110945   507200  7
[2020-01-24 01:31:09] DEBUG:    Connection 123733   -53442  2
[2020-01-24 01:32:28] DEBUG:    Connection -61408   38814   3
[2020-01-24 01:32:28] DEBUG:    Connection 514482   38813   8
[2020-01-24 01:34:15] DEBUG:    Connection 3248 11318   3
[2020-01-24 01:34:15] DEBUG:    Connection 11317    11319   6
[2020-01-24 01:34:39] DEBUG:    Connection 512485   67429   8
[2020-01-24 01:34:58] DEBUG:    Connection -64422   513391  3
[2020-01-24 01:35:34] DEBUG:    Connection -63824   -73258  2
[2020-01-24 01:36:22] DEBUG:    Connection -505749  -512335 4
[2020-01-24 01:36:22] DEBUG:    Connection 529555   -512334 2
[2020-01-24 01:37:37] DEBUG:    Connection 95504    -118538 4
[2020-01-24 01:37:37] DEBUG:    Connection 95505    -118537 3
[2020-01-24 01:37:45] DEBUG:    Connection -503562  -503560 7
[2020-01-24 01:37:45] DEBUG:    Connection 511835   -89228  3
[2020-01-24 01:40:06] DEBUG:    Connection -115885  116096  6
[2020-01-24 01:40:14] DEBUG:    Connection -117390  -24801  2
...

and it's still going.

Here's hoping I don't hit the wall time limit on this node.

Psy-Fer commented 4 years ago

aaand I hit the node wall time.

I will assemble this genome if it kills me. I'll ask the sys admin to up the wall time for this node, and start again. Start the clock, see you in 20 something days

mikolmogorov commented 4 years ago

@Psy-Fer tough one :( Do you have the full log of that run? If there is a particular bottleneck, maybe I can quickly fix it..

Psy-Fer commented 4 years ago

Sure, here it is.

https://www.dropbox.com/s/hbb3hg6qp3ob8wa/wall_time.log.1.gz?dl=0

Thanks for your help.

P.S. I keep telling people how good flye is, both because it is really good, and because of the support.

mikolmogorov commented 4 years ago

@Psy-Fer Thanks!

Well, looks like this one's on me. I've recently added code that validates the graph structure to trace where the bug occurs. Normally, this check takes seconds, but it was taking 8 days on your dataset :)

I've disabled these checks now in the latest flye branch - so please do update before running it again. Should be way faster. Hopefully it will work this time!

Mikhail

Psy-Fer commented 4 years ago

Ha! that explains it.

Okay, downloaded, compiled, and running! --version works now too

[2020-01-28 16:08:41] INFO: Starting Flye 2.7b-b1530
[2020-01-28 16:08:41] INFO: Resuming previous run
[2020-01-28 16:08:41] INFO: >>>STAGE: repeat
[2020-01-28 16:08:41] INFO: Building and resolving repeat graph
[2020-01-28 16:08:42] INFO: Reading sequences

Fingers crossed!

Thanks again!

mikolmogorov commented 4 years ago

@Psy-Fer cool! Just to double check: are you running the one from flye, not flye-devel? I originally put the fix only in flye (synchronized both just now).

Psy-Fer commented 4 years ago

Ha! No. But I am now.

cabbagesofdoom commented 4 years ago

@fenderglass Quick update from me too... I had to wait for a few other assemblies to finish (including some flye ones!) but the sysadmin put on the flye branch version today and I have set it running again from the repeat stage:

[2020-01-28 13:34:59] INFO: Starting Flye 2.7b-b1526
[2020-01-28 13:34:59] INFO: Resuming previous run
[2020-01-28 13:34:59] INFO: >>>STAGE: repeat
[2020-01-28 13:34:59] INFO: Building and resolving repeat graph
[2020-01-28 13:34:59] INFO: Reading sequences
[2020-01-28 13:52:38] INFO: Building repeat graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2020-01-28 21:12:36] INFO: Median overlap divergence: 0.0618715
...

Psy-Fer commented 4 years ago

Ahh, I have an issue now, trying to compile...

/usr/include/c++/4.8.2/bits/alloc_traits.h:393:57:   required from ‘static decltype (_S_construct(__a, __p, (forward<_Args>)(std::allocator_traits::construct::__args)...)) std::allocator_traits<_Alloc>::construct(_Alloc&, _Tp*, _Args&& ...) [with _Tp = libcuckoo_bucket_container<FastaRecord::Id, OverlapContainer::IndexVecWrapper, std::allocator<std::pair<const FastaRecord::Id, OverlapContainer::IndexVecWrapper> >, unsigned char, 4ul>::bucket; _Args = {}; _Alloc = std::allocator<std::pair<const FastaRecord::Id, OverlapContainer::IndexVecWrapper> >; decltype (_S_construct(__a, __p, (forward<_Args>)(std::allocator_traits::construct::__args)...)) = <type error>]’
/home/jamfer/work/Flye/lib/libcuckoo/libcuckoo_bucket_container.hh:118:50:   required from ‘libcuckoo_bucket_container<Key, T, Allocator, Partial, SLOT_PER_BUCKET>::libcuckoo_bucket_container(libcuckoo_bucket_container<Key, T, Allocator, Partial, SLOT_PER_BUCKET>::size_type, const allocator_type&) [with Key = FastaRecord::Id; T = OverlapContainer::IndexVecWrapper; Allocator = std::allocator<std::pair<const FastaRecord::Id, OverlapContainer::IndexVecWrapper> >; Partial = unsigned char; long unsigned int SLOT_PER_BUCKET = 4ul; libcuckoo_bucket_container<Key, T, Allocator, Partial, SLOT_PER_BUCKET>::size_type = long unsigned int; libcuckoo_bucket_container<Key, T, Allocator, Partial, SLOT_PER_BUCKET>::allocator_type = std::allocator<std::pair<const FastaRecord::Id, OverlapContainer::IndexVecWrapper> >]’
/home/jamfer/work/Flye/lib/libcuckoo/cuckoohash_map.hh:105:58:   required from ‘cuckoohash_map<Key, T, Hash, KeyEqual, Allocator, SLOT_PER_BUCKET>::cuckoohash_map(cuckoohash_map<Key, T, Hash, KeyEqual, Allocator, SLOT_PER_BUCKET>::size_type, const Hash&, const KeyEqual&, const Allocator&) [with Key = FastaRecord::Id; T = OverlapContainer::IndexVecWrapper; Hash = std::hash<FastaRecord::Id>; KeyEqual = std::equal_to<FastaRecord::Id>; Allocator = std::allocator<std::pair<const FastaRecord::Id, OverlapContainer::IndexVecWrapper> >; long unsigned int SLOT_PER_BUCKET = 4ul; cuckoohash_map<Key, T, Hash, KeyEqual, Allocator, SLOT_PER_BUCKET>::size_type = long unsigned int]’
repeat_graph/../sequence/overlap.h:292:21:   required from here
/home/jamfer/work/Flye/lib/libcuckoo/libcuckoo_bucket_container.hh:58:35: warning: missing initializer for member ‘std::array<bool, 4ul>::_M_elems’ [-Wmissing-field-initializers]
make[1]: *** [repeat_graph/haplotype_resolver.o] Error 1
make[1]: Leaving directory `/home/jamfer/work/Flye/src'
make: *** [all] Error 2

Thoughts?

I'm going to try nuking my environment and building it all again just to make sure.