isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

error:Overlap not computed #206

Open Mondlii opened 2 years ago

Mondlii commented 2 years ago

Hi Racon,

I am trying to use Racon to polish an assembly with my Illumina data. I have been able to polish using only the long reads, but I cannot get the same to work with the Illumina data. I have read all the previous reports of this issue and still cannot solve it. The latest post suggested using the latest version, but the latest version I can get through conda is 1.4.20, and my school is very strict about building and installing software outside of conda environments.

I used a previously mentioned script to merge my paired-end reads into one file, and that didn't give any errors. The command I ran:

racon -m 8 -x -6 -g -8 -w 500 -t 14 ILPE_joined.fastq mapped.sam bridged_contigs.fasta > Racon_polished.fasta

The first 10 lines of the SAM file:
@SQ     SN:bctg00000000 LN:6731957
@SQ     SN:bctg00000001 LN:6356990
@SQ     SN:bctg00000002 LN:6178534
@SQ     SN:bctg00000003 LN:6044776
@SQ     SN:bctg00000004 LN:5947132
@SQ     SN:bctg00000005 LN:5277117
@SQ     SN:bctg00000006 LN:5249221
@SQ     SN:bctg00000007 LN:5077326
@SQ     SN:bctg00000008 LN:4978110
@SQ     SN:bctg00000009 LN:4760805

The first few header lines of the merged read file (NB: I am new to this, so the best way I knew to view them was head followed by grep for @):
@A00721:395:H7GYTDSX3:4:1101:1181:10001
@A00721:395:H7GYTDSX3:4:1101:1524:10001
@A00721:395:H7GYTDSX3:4:1101:1542:10001
@A00721:395:H7GYTDSX3:4:1101:1759:10001
@A00721:395:H7GYTDSX3:4:1101:2139:10001
@A00721:395:H7GYTDSX3:4:1101:2736:10001
@A00721:395:H7GYTDSX3:4:1101:3242:10001
@A00721:395:H7GYTDSX3:4:1101:3821:10001
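
Filtering the output of head with grep for @ can also match quality lines, since a FASTQ quality string may begin with @. A minimal sketch of an alternative is to print every fourth line starting from the first. It assumes unwrapped 4-line FASTQ records, and the helper name `fastq_headers` is hypothetical:

```shell
# Print the header line of the first N records of a FASTQ file.
# Assumption: every record is exactly 4 lines (header, sequence, '+', quality).
# `fastq_headers` is a hypothetical helper name, not part of any tool.
fastq_headers() {
    # $1 = FASTQ file, $2 = number of records to show (default 10)
    awk -v n="${2:-10}" 'NR % 4 == 1 { print; if (++shown >= n) exit }' "$1"
}
```

For example, `fastq_headers ILPE_joined.fastq 8` would list the headers of the first eight records.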

This is after I used the script you suggested in a previous post. Please advise.

rvaser commented 2 years ago

Hello, please paste the command you run to get the mapped.sam file.

Best regards, Robert

Mondlii commented 2 years ago

Hi, thank you for your response.


The command I used: 

bwa mem -t 8 bridge_contigs/bridged_contigs.fasta /Macrogen_datqa/ILPE_unclassified_1.fastq /Macrogen_datqa/ILPE_unclassified_2.fastq > /nlustre/users/Macrogen_datqa/map_nanopore.sam

rvaser commented 2 years ago

Please run head -n1 ILPE_unclassified_1.fastq and the same for ILPE_unclassified_2.fastq.

Mondlii commented 2 years ago

ILPE_unclassified_1.fastq:
@A00721:395:H7GYTDSX3:4:1101:1181:1000 1:N:0:CGGAACTG+TCGTAGTG

ILPE_unclassified_2.fastq:
@A00721:395:H7GYTDSX3:4:1101:1181:1000 2:N:0:CGGAACTG+TCGTAGTG

rvaser commented 2 years ago

Please rerun bwa-mem with the combined file ILPE_joined.fastq. The problem is that, after the preprocessing script, the read names in the .fastq file differ from those in your .sam file, so Racon disregards all overlaps.
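
One way to confirm this kind of mismatch is to count how many read names from the FASTQ file also appear in the QNAME column of the SAM file. The following is a minimal sketch and not part of Racon or bwa. The function name `count_shared_names` is hypothetical, it assumes unwrapped 4-line FASTQ records, and it holds all FASTQ names in memory, so it is best tried on a subset of the reads. The remapping command in the final comment reuses file names from this thread, and its paths are assumptions.

```shell
# Count distinct read names that occur in both a FASTQ file and a SAM file.
# Assumption: 4-line FASTQ records; `count_shared_names` is a hypothetical name.
count_shared_names() {
    # $1 = FASTQ file, $2 = SAM file
    awk -F '\t' '
        # First file (FASTQ): remember the name on each header line,
        # i.e. the text after "@" up to the first space or tab.
        NR == FNR {
            if (FNR % 4 == 1) {
                split($0, parts, /[ \t]/)
                name = parts[1]
                sub(/^@/, "", name)
                names[name] = 1
            }
            next
        }
        # Second file (SAM): skip header lines, which start with "@".
        /^@/ { next }
        # Count each QNAME (column 1) once if it was seen in the FASTQ.
        ($1 in names) && !seen[$1]++ { shared++ }
        END { print shared + 0 }
    ' "$1" "$2"
}

# Remapping with the combined file, as suggested above (paths are assumptions):
#   bwa mem -t 8 bridged_contigs.fasta ILPE_joined.fastq > mapped.sam
```

Running `count_shared_names ILPE_joined.fastq mapped.sam` and getting 0 would be consistent with the explanation above. After remapping with the combined file, the count should be greater than 0.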