Closed WallyL closed 5 years ago
I suspect you won't want to redo analyses, but there is a newer version available.
> git clone https://github.com/marbl/canu.git
> cd canu/src
> git checkout v1.9
> make -j 8
Back to your failure. What's in unitigging/3-overlapErrorAdjustment/oea.000004.out? It's failing quickly, which is usually a good sign.
Thanks, Brian, I will definitely update to the latest version for future runs... I haven't run Canu in a while.
Here is the output. Looks like the dreaded segmentation fault...
Initializing.
Opening gkpStore '../Ah_51307.gkpStore'.
Correcting reads 159526 to 212935.
Reading 3572082 corrections from './red.red'.
Correcting 300057619 bases with 2192621 indel adjustments.
--Allocate 286 + 16 + 1 MB for bases, adjusts and reads.
Corrected 299859587 bases with 68891 substitutions, 198035 deletions and 3 insertions.
Loading overlaps.
Read_Olaps()-- Loading 39844812 overlaps from '../Ah_51307.ovlStore' for reads 159526 to 212935
--Allocate 1215 MB for overlaps.
Read_Olaps()-- Loaded 39844812 overlaps -- 19956174 normal and 19888638 innie.
Sorting overlaps.
Failed with 'Segmentation fault'; backtrace (libbacktrace):
AS_UTL/AS_UTL_stackTrace.C::97 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()
(null)::0 in (null)()
overlapErrorAdjustment/correctOverlaps.H::172 in _ZN18Olap_Info_t_by_bIDclERK11Olap_Info_tS2_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/predefined_ops.h::123 in _ZN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEclIP11Olap_Info_tS6_EEbT_T0_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1897 in _ZSt21__unguarded_partitionIP11Olap_Info_tN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEET_S7_S7_S7_T0_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1918 in _ZSt27__unguarded_partition_pivotIP11Olap_Info_tN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEET_S7_S7_T0_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1948 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1949 in _ZSt16__introsort_loopIP11Olap_Info_tlN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_T1_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::1963 in _ZSt6__sortIP11Olap_Info_tN9__gnu_cxx5__ops15_Iter_comp_iterI18Olap_Info_t_by_bIDEEEvT_S7_T0_()
/usr/local/apps/eb/GCCcore/5.4.0/include/c++/5.4.0/bits/stl_algo.h::4729 in _ZNSt9__cxx19984sortIP11Olap_Info_t18Olap_Info_t_by_bIDEEvT_S4_T0_()
overlapErrorAdjustment/correctOverlaps.C::169 in main()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
./oea.sh: line 109: 29775 Segmentation fault (core dumped) $bin/correctOverlaps -G ../Ah_51307.gkpStore -O ../Ah_51307.ovlStore -R $minid $maxid -e 0.020 -l 500 -c ./red.red -o ./$jobid.oea.WORKING
If you can change code, change line 178 of src/overlapErrorAdjustment/correctOverlaps.H from
return(a.innie != b.innie);
to
return(a.innie < b.innie);
and recompile.
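For context on the change suggested above: `std::sort` requires its comparator to be a strict weak ordering, and `a.innie != b.innie` returns true in both directions for two records that differ only in `innie`. With libstdc++, a comparator like that can let the unguarded partition loop (visible in the backtrace) run past the ends of the array. The sketch below only illustrates the difference; `Olap`, `ByBIdBroken`, and `ByBIdFixed` are hypothetical, simplified stand-ins rather than Canu's actual `Olap_Info_t` code, and the order of fields in the comparison is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical, simplified overlap record; NOT Canu's actual Olap_Info_t.
struct Olap {
  uint32_t a_id;
  uint32_t b_id;
  bool     innie;
};

// Not a strict weak ordering: if two records differ only in `innie`,
// both cmp(a, b) and cmp(b, a) return true.
struct ByBIdBroken {
  bool operator()(const Olap &a, const Olap &b) const {
    if (a.b_id != b.b_id) return a.b_id < b.b_id;
    if (a.a_id != b.a_id) return a.a_id < b.a_id;
    return a.innie != b.innie;
  }
};

// Strict weak ordering: lexicographic compare on (b_id, a_id, innie).
struct ByBIdFixed {
  bool operator()(const Olap &a, const Olap &b) const {
    return std::tie(a.b_id, a.a_id, a.innie) <
           std::tie(b.b_id, b.a_id, b.innie);
  }
};

// True if cmp does not claim both x < y and y < x.
template <typename Cmp>
bool is_asymmetric_on(const Olap &x, const Olap &y, Cmp cmp) {
  return !(cmp(x, y) && cmp(y, x));
}

// Sorts a copy of the records with the well-behaved comparator.
std::vector<Olap> sort_overlaps(std::vector<Olap> v) {
  std::sort(v.begin(), v.end(), ByBIdFixed());
  return v;
}

// Usage sketch: the broken comparator fails the asymmetry check,
// the fixed one passes it.
void demo() {
  Olap x{1, 2, false}, y{1, 2, true};
  assert(!is_asymmetric_on(x, y, ByBIdBroken()));
  assert( is_asymmetric_on(x, y, ByBIdFixed()));
}
```

This shows only why `<` is the safer comparison; it does not establish that the comparator was the cause of this particular crash.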
Will do, thanks again!
Please let me know if this works or not. It's just a guess.
Brian, I changed the code as you suggested, but it didn't work...
Since my other 4 assemblies actually got worse with more data, doubtful this sample will improve. If you have other suggestions for code changes and want me to test them to fix this particular bug in v1.7, I'll be happy to do so, but I'm going to run any new tests with one of the later versions.
It looks like version 1.8 is the latest released version, would you still recommend that I move to v1.9, which is developmental? I have both installed.
Best, Walt
Well, darn. Would you be willing to share the failing assembly so I can debug it? FTP directions are in the FAQ.
v1.9 is suggested. It is essentially the next release - I'm not entirely sure why we aren't making an actual release for it.
Not sure what all you needed, so I made a tar file, WallyL_failed_v1.7_run.tar, with everything in it, i.e. correction_trim dir. and the assy dir. It's ~50 GB and the upload is complete.
Saw the file there but had some trouble extracting it. Do you mind re-sending and posting the md5 here to confirm? The unitigging folder should be sufficient for debugging, no need for the correction/trimming folders. You can also gzip to make it smaller for the upload if you want the transfer to be faster.
Hi Sergey,
We had some connectivity issues in our building last week, so it possibly got corrupted during transfer. I am uploading a tarball of the unitigging dir. now.
The md5 = 7731a794ba14ba5190c09fd836b2a0c8
Walt
Got it now thanks, the MD5 looks correct as well.
Can you also send the Ah_51307.gkpStore from one level up from unitigging? I forgot to tell you to run tar so that it follows symlinks. It also looks like you ran canu 1.8 on this folder after 1.7 failed? That would likely not work but shouldn't have caused your initial error.
Yeah, I was trying different things to get past the assy error, so thought I'd try to assemble the trimmed reads with the later version. So, as a rule, this is not a good idea?
The reason I ask is because I have corrected and trimmed these 5 samples 3 diff. ways (the v1.7 assembly error occurred only on the 51307 sample corrected as described above).
However, for another v1.7 corrected set (corMhapSensitivity=high corOutCoverage=200 corMinCoverage=6) I ran the v1.9 assembler and pointed to corrected/trimmed dirs. and all appear to have run just fine after removing gnuplotTested=true, which is no longer recognized.
Ah_51307.gkpStore.tgz file is now uploaded. The md5 is 44c70e7af05a0833df428366d4ce5afa
Sorry for getting back so late, we've made some large changes in this code and I tested your assembly with the new code and couldn't reproduce the crash. I think it's been fixed as part of the changes. The assembly isn't great, but it seems you've got super-high coverage here, downsampling to 100 or 200x before assembly may help. Here are the stats:
Total units: 326
BasesInFasta: 8465850
Min: 2,420
Max: 677,648
N25: 279,570 COUNT: 5
N50: 138,022 COUNT: 14
N75: 18,472 COUNT: 78
Were you ever able to get an assembly, do you want me to share this one?
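For readers unfamiliar with the N25/N50/N75 figures above: Nx is commonly defined as the length of the contig at which the running total, summing from the longest contig down, first reaches x% of all assembled bases. A minimal sketch of that definition follows; the function name `nx` is hypothetical, and reading COUNT as the number of contigs needed to reach the threshold is an assumption about the stats tool used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Returns {Nx length, number of contigs needed to reach it} for a
// fraction in (0, 1], e.g. 0.5 for N50. Returns {0, 0} for empty input.
std::pair<uint64_t, std::size_t> nx(std::vector<uint64_t> lengths,
                                    double fraction) {
  // Longest contigs first.
  std::sort(lengths.begin(), lengths.end(), std::greater<uint64_t>());

  uint64_t total = 0;
  for (uint64_t len : lengths) total += len;

  uint64_t running = 0;
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    running += lengths[i];
    if (running >= fraction * total) return {lengths[i], i + 1};
  }
  return {0, 0};
}

// Usage sketch on a toy assembly of five contigs (30 bases total).
void nx_demo() {
  std::pair<uint64_t, std::size_t> n50 = nx({10, 8, 5, 4, 3}, 0.5);
  assert(n50.first == 8 && n50.second == 2);
}
```

Under this definition, an N50 count of 14 out of 326 units would mean that a small number of long contigs carry half of the assembled bases.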
Thanks, no worries. I have generated several assys for this sample using ver. 1.9, but I am getting similar metrics to yours... very fragmented for PB bacterial data.
All 5 of the genomes in this run are refractory to single-contig assy; we've also run them through PB HGAP4 and got similar results to Canu. We are assuming that it is either a sample qual./lib. issue or there are some weird structural/repetition effects... definitely not E. coli! Size should be around 4.8-5.2 Mb. If I run Knot software post-Canu, I can get some down to 5-12 contigs.
We are in the process of generating some ONT data to throw in the mix and hopefully that will resolve things. I'll keep you posted, if you're interested.
Is this a clonal sample or a plate scrape or something else? It seems given the genome size that there may be more than one strain that's causing the assembly to be split/generating a bigger genome. If that's the case, ONT isn't going to help since it will capture the same variations again.
I'm going to close the issue since the oea issue is fixed, feel free to comment with any updates on it though. I'll also try down-sampling the coverage on your data and see if that results in any improvements.
I'm assembling several PacBio seq'd multiplexed bacterial genomes (Sequel data) using Canu ver. 1.7. All jobs were run on a Linux cluster using a single node, 32 core, 500 GB ram.
The assemblies have been refractory and too fragmented, so I've been testing different assembly approaches after correcting and trimming the raw data separately for each sample.
All assemblies completed except one (it has completed for all other tests using different correction/trim/assy parameters). I realize that the coverage is way in excess here, but at this point I'm trying anything. I got the "Don't panic/restart" error, so I restarted the assembly 3-4 times, but to no avail.
Correction and trimming both ran to completion, assembly failed using the following cmds: