isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

Segmentation Fault #90

Open wangshun1121 opened 6 years ago

wangshun1121 commented 6 years ago

Hello:

I got an assembly from MeCAT, then removed heterozygous redundancy and assembled scaffolds using redundans. After filling gaps with PBJelly, I tried to polish my assembly using the following commands:

# First Polish
minimap2 -t 70 -cx map-pb ./data/out/jelly.out.fasta ../PacBio.fa |gzip -c >./Racon/Jelly.align.paf.gz
racon ../PacBio.fa ./Racon/Jelly.align.paf.gz ./data/out/jelly.out.fasta -t 70 -u -f 1>./Racon/Racon1.fa

# Second Polish
minimap2 -t 70 -cx map-pb ./Racon/Racon1.fa ../PacBio.fa|gzip -c >./Racon/Racon.align.paf.gz
racon ../PacBio.fa ./Racon/Racon.align.paf.gz ./Racon/Racon1.fa -t 70 -u -f > ./Racon/Racon2.fa

I got the following messages after the first polish:

[racon::Polisher::initialize] loaded target sequences
[racon::Polisher::initialize] loaded sequences
[racon::Polisher::initialize] loaded overlaps
[racon::Polisher::initialize] aligned overlap 20755039/20755039
[racon::Polisher::initialize] transformed data into windows
[racon::Window::generate_consensus] warning: contig 131 might be chimeric in window 475!
[racon::Window::generate_consensus] warning: contig 708 might be chimeric in window 82!
[racon::Window::generate_consensus] warning: contig 1066 might be chimeric in window 386!
[racon::Window::generate_consensus] warning: contig 1103 might be chimeric in window 183!
[racon::Window::generate_consensus] warning: contig 1745 might be chimeric in window 632!
[racon::Window::generate_consensus] warning: contig 1765 might be chimeric in window 2234!
[racon::Window::generate_consensus] warning: contig 2321 might be chimeric in window 1146!
[racon::Window::generate_consensus] warning: contig 2793 might be chimeric in window 325!
[racon::Window::generate_consensus] warning: contig 3109 might be chimeric in window 161!
[racon::Window::generate_consensus] warning: contig 3932 might be chimeric in window 171!
[racon::Window::generate_consensus] warning: contig 3939 might be chimeric in window 14!
[racon::Polisher::polish] generated consensus for window 1763070/1763070
Segmentation fault (core dumped)

Still, I got 3998 sequences totaling 880M.

Then, when I ran racon again for the second polish, it failed:

[racon::Polisher::initialize] loaded target sequences
[racon::Polisher::initialize] loaded sequences
[racon::Polisher::initialize] loaded overlaps
[racon::Polisher::initialize] aligned overlap 21715775/21715775
[racon::Polisher::initialize] transformed data into windows
Segmentation fault (core dumped)

wangshun1121 commented 6 years ago

I installed racon v1.3.1 from bioconda.

rvaser commented 6 years ago

Hello, I am not sure what might cause a segmentation fault after polishing is done. How long did racon take in the first iteration? Could you maybe download racon from GitHub, compile it in debug mode, and run it through gdb?

Best regards, Robert
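
For readers who want to follow this suggestion, here is a minimal sketch of a debug build and a gdb run. It assumes git, cmake, make and gdb are installed, uses the repository URL linked at the top of this page, and assumes the build places the binary in build/bin/racon. The function names are hypothetical helpers, not part of racon.

```shell
# Sketch only: build racon with debug symbols, then start it under gdb.
# Assumption: the clone target directory "racon" does not exist yet.
build_racon_debug() {
    git clone --recursive https://github.com/lbcb-sci/racon.git racon &&
    mkdir -p racon/build &&
    (
        cd racon/build &&
        # Debug build type keeps symbols so the backtrace shows source lines.
        cmake -DCMAKE_BUILD_TYPE=Debug .. &&
        make
    )
}

# Start gdb with the given racon arguments already set.
# Inside gdb, type "run" to start the program and "bt" after the crash
# to print the backtrace.
racon_under_gdb() {
    gdb --args ./racon/build/bin/racon "$@"
}
```

A possible invocation, reusing the arguments of the failing second round, would be `build_racon_debug && racon_under_gdb ../PacBio.fa ./Racon/Racon.align.paf.gz ./Racon/Racon1.fa -t 70 -u -f`. Because racon writes the consensus to standard output, it will be mixed with gdb's own output in the terminal.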

wangshun1121 commented 6 years ago

It took about 10 hours in the first round, but in the second round racon failed. Now I am trying the racon_wrapper script and waiting for the results.

wangshun1121 commented 6 years ago

I tried the following command in the second round:

racon_wrapper --split 876890000 -t 70 -u -f ../PacBio.fa ./Racon/Racon.align.paf.gz ./Racon/Racon1.fa >./Racon/Racon2.fa

and luckily, it succeeded!

[RaconWrapper::run] preparing data with rampler
[RaconWrapper::run] processing data with racon
[racon::Polisher::initialize] loaded target sequences
[racon::Polisher::initialize] loaded sequences
[racon::Polisher::initialize] loaded overlaps
[racon::Polisher::initialize] aligned overlap 21713263/21713263
[racon::Polisher::initialize] transformed data into windows
[racon::Window::generate_consensus] warning: contig 102 might be chimeric in window 286!
[racon::Window::generate_consensus] warning: contig 175 might be chimeric in window 575!
[racon::Window::generate_consensus] warning: contig 226 might be chimeric in window 1367!
[racon::Window::generate_consensus] warning: contig 285 might be chimeric in window 606!
[racon::Window::generate_consensus] warning: contig 304 might be chimeric in window 1357!
[racon::Window::generate_consensus] warning: contig 319 might be chimeric in window 8973!
[racon::Window::generate_consensus] warning: contig 319 might be chimeric in window 9913!
[racon::Window::generate_consensus] warning: contig 354 might be chimeric in window 287!
[racon::Window::generate_consensus] warning: contig 391 might be chimeric in window 720!
[racon::Window::generate_consensus] warning: contig 578 might be chimeric in window 2181!
[racon::Window::generate_consensus] warning: contig 591 might be chimeric in window 6399!
[racon::Window::generate_consensus] warning: contig 601 might be chimeric in window 367!
[racon::Window::generate_consensus] warning: contig 788 might be chimeric in window 35!
[racon::Window::generate_consensus] warning: contig 978 might be chimeric in window 1433!
[racon::Window::generate_consensus] warning: contig 985 might be chimeric in window 1026!
[racon::Window::generate_consensus] warning: contig 1019 might be chimeric in window 40!
[racon::Window::generate_consensus] warning: contig 1068 might be chimeric in window 2133!
[racon::Window::generate_consensus] warning: contig 1123 might be chimeric in window 1589!
[racon::Window::generate_consensus] warning: contig 1141 might be chimeric in window 365!
[racon::Window::generate_consensus] warning: contig 1271 might be chimeric in window 1!
[racon::Window::generate_consensus] warning: contig 1355 might be chimeric in window 135!
[racon::Window::generate_consensus] warning: contig 1634 might be chimeric in window 5931!
[racon::Window::generate_consensus] warning: contig 1677 might be chimeric in window 2362!
[racon::Window::generate_consensus] warning: contig 1677 might be chimeric in window 3067!
[racon::Window::generate_consensus] warning: contig 1808 might be chimeric in window 437!
[racon::Window::generate_consensus] warning: contig 2299 might be chimeric in window 22!
[racon::Window::generate_consensus] warning: contig 2319 might be chimeric in window 123!
[racon::Window::generate_consensus] warning: contig 2400 might be chimeric in window 769!
[racon::Window::generate_consensus] warning: contig 2833 might be chimeric in window 767!
[racon::Window::generate_consensus] warning: contig 3203 might be chimeric in window 155!
[racon::Window::generate_consensus] warning: contig 3250 might be chimeric in window 86!
[racon::Window::generate_consensus] warning: contig 3334 might be chimeric in window 962!
[racon::Window::generate_consensus] warning: contig 3482 might be chimeric in window 1564!
[racon::Window::generate_consensus] warning: contig 3713 might be chimeric in window 269!
[racon::Window::generate_consensus] warning: contig 3748 might be chimeric in window 56!
[racon::Polisher::polish] generated consensus for window 1755543/1755543

rvaser commented 6 years ago

Well, that is great! But it is kinda weird that you got a segmentation fault instead of being killed if it was a memory issue.

Is the number of sequences now as expected?
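
The difference between a segmentation fault and a kill also shows up in the exit status the shell reports. A minimal sketch, assuming a Linux system where a process terminated by signal N is reported as 128 + N (SIGKILL is 9, SIGSEGV is 11); `describe_exit` is a hypothetical helper name:

```shell
# Translate a shell exit status into a likely cause.
# Assumption: Linux signal numbers (SIGKILL = 9, SIGSEGV = 11).
describe_exit() {
    case "$1" in
        0)   echo "success" ;;
        137) echo "killed by SIGKILL (9), for example by the kernel OOM killer" ;;
        139) echo "segmentation fault, SIGSEGV (11)" ;;
        *)   echo "exit status $1" ;;
    esac
}
```

It could be called right after the racon command, for example `racon ... > ./Racon/Racon2.fa; describe_exit $?`.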

wangshun1121 commented 6 years ago

Memory is more than sufficient! I have 256G of memory and racon used 160G at most.

rvaser commented 6 years ago

If you ran the second iteration on top of the 3000 sequences only, try running the wrapper on the first iteration as well.
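
One way to check whether a polishing round lost contigs is to compare the number of FASTA records before and after. A minimal sketch, assuming uncompressed FASTA files in which every record starts with a `>` header line; both function names are hypothetical. Since the commands above pass `-u` (include unpolished sequences), the counts would be expected to match if racon finished writing its output.

```shell
# Count FASTA records by counting header lines that start with '>'.
count_records() {
    grep -c '^>' "$1"
}

# Succeed only if both files contain the same number of records.
same_record_count() {
    [ "$(count_records "$1")" -eq "$(count_records "$2")" ]
}
```

For the first round this could be used as `same_record_count ./data/out/jelly.out.fasta ./Racon/Racon1.fa || echo "record counts differ"`.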