Segmentation faults related to input PAF file

Sandman2127 commented 3 years ago

Hi isovic, I've been struggling with segmentation faults in Racon v1.4.22 for a few days, and I believe I've narrowed the problem down to my PAF file.

My commands and outputs are as follows:

Minimap2

/opt/miniconda/bin/minimap2 --secondary=no -t 160 -x sr contig_1505_553_only.fa C041_R1_001.adapter.trimmed.fq.gz > C041_R1_001.adapter.trimmed.1.paf

MM2 output (successful):

[M::main] Version: 2.17-r941
[M::main] CMD: /opt/miniconda/bin/minimap2 -t 160 -x sr contig_1505_553_only.fa C041_R1_001.adapter.trimmed.fq.gz
[M::main] Real time: 1187.529 sec; CPU: 28108.750 sec; Peak RSS: 2.294 GB

Racon:

$RACON --threads 160 C041_R1_001.adapter.trimmed.fq.gz C041_R1_001.adapter.trimmed.1.paf contig_1505_553_only.fa > ${ASSEMBLY%.fa}.racon.${iteration}.fa

I believe the problem is related to the PAF file because: 1) I've used this same Racon build to successfully polish other genomes; 2) Racon successfully loads the reference and the reads, but always seg faults on my alignment file; 3) the Racon stdout is as follows:

Racon stdout (segmentation fault):

[racon::Polisher::initialize] loaded target sequences 0.042702 s
[racon::Polisher::initialize] loaded sequences 2944.435211 s
/home/BIOTECH/dmsanders/progs/genome_assembly/modules/polish/runRacon.sh: line 65: 3284648 Segmentation fault (core dumped) $RACON --threads $Threads $fastq ${fastq%.fq.gz}.${iteration}.paf $ASSEMBLY > ${ASSEMBLY%.fa}.racon.${iteration}.fa

For some reason, with this dataset Racon never succeeds in loading the overlap file (the third initialization step).
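One way to confirm which initialization step was reached is to scan the captured log for the `[racon::Polisher::initialize] loaded ...` messages. The following is a minimal Python sketch, assuming the log text is available as a string; the function name `completed_stages` is hypothetical, and the stage names are taken only from the output quoted in this thread, not from Racon's documentation.

```python
import re

# Matches messages such as:
#   [racon::Polisher::initialize] loaded target sequences 0.042702 s
STAGE_RE = re.compile(
    r"\[racon::Polisher::initialize\] loaded "
    r"(?P<stage>[a-z ]+?) (?P<secs>\d+(?:\.\d+)?) s"
)


def completed_stages(log_text):
    """Return (stage, seconds) pairs for every 'loaded ...' message found."""
    return [
        (m.group("stage"), float(m.group("secs")))
        for m in STAGE_RE.finditer(log_text)
    ]


# Log excerpt copied from the failing run above.
log = (
    "[racon::Polisher::initialize] loaded target sequences 0.042702 s\n"
    "[racon::Polisher::initialize] loaded sequences 2944.435211 s\n"
)

stages = completed_stages(log)
for name, secs in stages:
    print(f"{name}: {secs:.1f} s")

# In the successful runs quoted in this thread a third message,
# "loaded overlaps", appears; its absence points at the overlap file step.
if not any(name == "overlaps" for name, _ in stages):
    print("overlaps were never reported as loaded")
```

Because the pattern is searched with `finditer`, it also works when the log messages have been joined onto a single line.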

$ head -3284649 C041_R1_001.adapter.trimmed.1.paf | tail -n3
ST-E00299:108:HMV3LCCXX:1:1104:1377:44837 150 8 47 - contig_553 8994067 4642519 4642558 39 39 9 tp:A:P cm:i:3 s1:i:39 s2:i:31 rl:i:0
ST-E00299:108:HMV3LCCXX:1:1104:1418:44837 150 14 143 + contig_553 8994067 1389092 1389221 111 129 3 tp:A:P cm:i:12 s1:i:111 s2:i:106 rl:i:0
ST-E00299:108:HMV3LCCXX:1:1104:13626:35713 150 4 144 + contig_553 8994067 5316405 5316545 140 140 60 tp:A:P cm:i:20 s1:i:140 s2:i:0 rl:i:0

This sure feels like a poisoned read to me, but I honestly can't see any difference between the middle line above, which Racon calls out as bad, and any other line. Any idea what could be causing this seg fault, or what to try?
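A structural check of the PAF records can help rule out a malformed line. Below is a minimal Python sketch, assuming the file is tab-delimited with the 12 mandatory PAF columns; the helper names `check_paf_line` and `find_bad_lines` are hypothetical. It does not reproduce Racon's own parser, so a clean result here does not rule out a bug on the Racon side.

```python
import io

# 0-based indices of the mandatory PAF columns that must be integers:
# query length/start/end, target length/start/end, matches,
# alignment block length, mapping quality.
INT_COLS = (1, 2, 3, 6, 7, 8, 9, 10, 11)


def check_paf_line(line):
    """Return a list of problems found in one PAF record (empty if none)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        return [f"expected at least 12 columns, found {len(fields)}"]
    problems = [
        f"column {i + 1} is not a non-negative integer: {fields[i]!r}"
        for i in INT_COLS
        if not (fields[i].isascii() and fields[i].isdigit())
    ]
    if fields[4] not in ("+", "-"):
        problems.append(f"column 5 (strand) is {fields[4]!r}, expected '+' or '-'")
    if problems:
        return problems  # coordinates cannot be compared if parsing failed
    qlen, qstart, qend = int(fields[1]), int(fields[2]), int(fields[3])
    tlen, tstart, tend = int(fields[6]), int(fields[7]), int(fields[8])
    if not qstart <= qend <= qlen:
        problems.append("query coordinates fall outside the query length")
    if not tstart <= tend <= tlen:
        problems.append("target coordinates fall outside the target length")
    return problems


def find_bad_lines(handle):
    """Yield (line_number, problems) for every record that fails a check."""
    for lineno, line in enumerate(handle, start=1):
        problems = check_paf_line(line)
        if problems:
            yield lineno, problems


# The middle record quoted above, rebuilt with tab separators.
good = "\t".join([
    "ST-E00299:108:HMV3LCCXX:1:1104:1418:44837", "150", "14", "143", "+",
    "contig_553", "8994067", "1389092", "1389221", "111", "129", "3",
    "tp:A:P", "cm:i:12", "s1:i:111", "s2:i:106", "rl:i:0",
])
# A synthetic, deliberately truncated record (not from the real file).
bad = "\t".join(good.split("\t")[:7])

report = list(find_bad_lines(io.StringIO(good + "\n" + bad + "\n")))
for lineno, problems in report:
    print(lineno, "; ".join(problems))
```

For the gzipped file, a handle from `gzip.open(path, "rt")` can be passed to `find_bad_lines` in place of the `io.StringIO` object.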

The machine I'm using to run Racon has 160 threads and 3 TB of RAM, so I don't think resources are the issue here.

rvaser commented 3 years ago

Hello, can you please check which commit you are on? Version 1.4.22 had two commits with no major changes to the codebase, but in the last commit (c248b4da0), a week ago, I tried to fix a parsing bug (which could cause this segmentation fault). Could you maybe send me the gzipped PAF file?

Best regards, Robert

Sandman2127 commented 3 years ago

Hi Robert, Thanks for the quick response. Looks like I am on that commit.

$ git log

commit c248b4da04a1109151f503412fffaed8f0cfc0f5 (HEAD -> master, origin/master, origin/HEAD)
Author: rvaser <robert.vaser@gmail.com>
Date:   Tue Jun 22 08:20:02 2021 +0800

    Fix overlap parsing bug

I'm gzipping it now. Will an Amazon S3 bucket work (I can make it open to the public), or would you prefer a different means of transfer?

Sandman2127 commented 3 years ago

Hi Robert, The data is in a public bucket to be shared with you right now.

In case you don't use S3 very often, it's very easy to use:

download via web interface:

go to: https://s3.console.aws.amazon.com/s3/object/racondata?region=us-east-2&prefix=C041_R1_001.adapter.trimmed.1.paf.gz

click download, if this doesn't work, try:

download via CLI:

install aws command line interface

conda install -c conda-forge awscli

make sure you're in the right region to pull the data (use us-east-2 for this data) and enter your AWS credentials, which you can find online in your AWS security credentials:

$ aws configure
AWS Access Key ID [****]:
AWS Secret Access Key [***]:
Default region name [us-east-2]: <if not us-east-2, enter it here>
Default output format [None]:

pull the data:

$ aws s3 cp s3://racondata/C041_R1_001.adapter.trimmed.1.paf.gz .

If you prefer a different transfer means, I'm also happy to do that, just let me know!

Sandman2127 commented 3 years ago

Hi Robert, This google drive link may be easier:

https://drive.google.com/file/d/1eRRuRpq-XQFoUllxTGkcMHY5pAWQM8wm/view?usp=sharing

rvaser commented 3 years ago

I downloaded the file, thanks. Running Racon with random reads/target parses through the whole paf without errors and filters the whole file as expected. Could you maybe send the reads and the assembly as well?

Sandman2127 commented 3 years ago

Hi Robert, I've put both of those files here:

Fasta file with 2 small contigs I'm trying to polish: https://racondata.s3.us-east-2.amazonaws.com/contig_1505_553_only.fa

44 Gb of R1 Fastq data: https://racondata.s3.us-east-2.amazonaws.com/C041_R1_001.adapter.trimmed.fq.gz

Please let me know if there is anything else I can do to help and thank you so much for helping me troubleshoot this.

tanaes commented 3 years ago

Encountered same issue with loading the overlaps using the most recent commit. Targets and sequences loaded, then segfault.

Rolling back a commit and recompiling fixed issue for me.

Sandman2127 commented 3 years ago

Thanks @tanaes, I'm giving that a go rn, I'll report back if it works !

rvaser commented 3 years ago

Thanks for the files. I have reproduced the segmentation fault and will let you know once I fix it.

Sandman2127 commented 3 years ago

> Encountered same issue with loading the overlaps using the most recent commit. Targets and sequences loaded, then segfault.
>
> Rolling back a commit and recompiling fixed issue for me.

Thanks @tanaes, I tried this rollback, going back just one commit to: 378dd810e728... update submodule ...

[STDOUT]: running: /home/BIOTECH/dmsanders/progs/racon/build/bin/racon --threads 160 C041_R1_001.adapter.trimmed.fq.gz C041_R1_001.adapter.trimmed.1.paf contig_1505_553_only.fa > contig_1505_553_only.racon.1.fa
[racon::Polisher::initialize] loaded target sequences 0.041458 s
[racon::Polisher::initialize] loaded sequences 3009.106979 s
[racon::Polisher::initialize] loaded overlaps 850.978313 s

As we can see, it finished loading the overlaps, but now it just seems to be sitting there with all the data in memory and no processes running. This is quite odd, as these two contigs are tiny relative to the whole-genome work we've previously used Racon on.

Thanks for looking into this @rvaser, patiently awaiting your results...

rvaser commented 3 years ago

I think I fixed it finally, please try the latest commit (@Sandman2127, your data now works).

Sandman2127 commented 3 years ago

Running it now! Thank you so much Robert. I'll get back to you with results!

rvaser commented 3 years ago

@Sandman2127, any luck?

Sandman2127 commented 3 years ago

Hi Robert, I'm on my third run of polishing and it looks like it's going to work out! The first two finished without error!

You did it, problem fixed, thank you sooo much! I'll close this now, you've solved it with your recent commit!

rvaser commented 3 years ago

Good to hear :)

isovic / racon