Closed jagos01 closed 3 years ago
Hi Scott, not sure what happened, could you please send me the raven.cereal file via mail?
Best regards, Robert
Below is the multimer plasmid you were talking about. It consists of only 3 reads, but the multiplicity is 8. Can you please check whether there is a read longer than 12 kb that maps fully to this sequence? You can extract all sequences from the GFA with `awk '$1 ~/S/ {print ">"$2"\n"$3}' graph.gfa > seqs.fa`
and then pull out this weird plasmid with `grep ">Utg1074" -A1 seqs.fa > plasmid.fa`.
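For anyone who prefers not to use awk, the same extraction can be sketched in Python. This is only a minimal illustration, assuming a tab-delimited GFA in which segment records start with `S` followed by the name and the sequence; the function names `gfa_segments` and `to_fasta` and the demo records are made up for this example and are not part of Raven.

```python
def gfa_segments(lines):
    """Yield (name, sequence) for every segment (S) record in GFA lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Segment records look like: S <name> <sequence> [optional tags...]
        if len(fields) >= 3 and fields[0] == "S":
            yield fields[1], fields[2]


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text, one sequence per line."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


if __name__ == "__main__":
    # Made-up GFA lines for illustration only; a real run would iterate
    # over open("graph.gfa") instead.
    demo = [
        "S\tUtg1074\tACGTACGT\tLN:i:8\n",
        "L\tUtg1074\t+\tUtg1074\t+\t0M\n",
        "S\tUtg2\tGGCC\n",
    ]
    # Keep only the segment of interest, like the grep step above.
    wanted = [(n, s) for n, s in gfa_segments(demo) if n == "Utg1074"]
    print(to_fasta(wanted), end="")
```

Matching the first field exactly against `S` makes the intent explicit and keeps link and header lines out of the FASTA.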
Yes, there are several reads longer than 12 kb mapping to the plasmid. I have re-basecalled the data with guppy v4.4.2. Depending on how I demultiplex (guppy_barcoder or qcat), I get either 2 circular contigs (chromosome and large plasmid) or 2 circular contigs and one linear contig. The linear contig is still larger than 12 kb (~56 kb or 64 kb). I have not extracted the sequences yet but suspect reads longer than 12 kb will map to the linear contig.
Are those longer reads sequencing artefacts? Should there be only one circular plasmid of 12kbp?
Hello Robert, Correct, there should only be one 12kb plasmid so the longer reads must be artifacts. I will look at some of the reads and see if they can be filtered out. Thanks for your help. Scott
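One possible way to list candidate artifact reads is to map the reads against the extracted plasmid sequence and keep the names of reads that are longer than the expected plasmid yet align almost end to end. The sketch below assumes the alignments are in PAF format (for example from `minimap2 -x map-ont plasmid.fa reads.fastq > reads.paf`) and relies only on the standard PAF columns: query name, query length, query start, query end and target name. The function name `long_reads_on_target` and the 12,000 bp and 0.9 coverage thresholds are illustrative assumptions, not values from this thread's analysis.

```python
def long_reads_on_target(paf_lines, target, min_len=12_000, min_cov=0.9):
    """Return {read_name: best_query_coverage} for reads longer than
    min_len whose alignment to `target` covers at least min_cov of the read."""
    hits = {}
    for line in paf_lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 12:
            continue  # not a complete PAF record
        qname, qlen, qstart, qend = f[0], int(f[1]), int(f[2]), int(f[3])
        tname = f[5]
        if tname != target or qlen <= min_len:
            continue
        coverage = (qend - qstart) / qlen  # fraction of the read aligned
        if coverage >= min_cov:
            hits[qname] = max(coverage, hits.get(qname, 0.0))
    return hits


if __name__ == "__main__":
    # Made-up PAF records for illustration only; a real run would iterate
    # over open("reads.paf") instead.
    demo = [
        "readA\t24000\t100\t23900\t+\tUtg1074\t96600\t0\t23800\t23000\t23800\t60\n",
        "readB\t8000\t0\t8000\t+\tUtg1074\t96600\t500\t8500\t7900\t8000\t60\n",
    ]
    for name, cov in long_reads_on_target(demo, "Utg1074").items():
        print(name, round(cov, 3))
```

The resulting names could then be inspected by hand or excluded before reassembly. Whether excluding them is appropriate depends on what the reads turn out to be.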
Hello Robert, I am using Raven v1.4.0 to assemble a bacterial genome (using nanopore data) which contains 1 chromosome, 1 large plasmid and 1 small plasmid. When I view the GFA file in Bandage, I can see that 3 circular contigs are generated. The chromosome and large plasmid are the correct size, but the small plasmid (~12 kb) seems to assemble as a multimer (96.6 kb) and is not output to the FASTA file. Do you have any suggestions that might allow this plasmid to assemble correctly? A large percentage of the reads (~30%) in this dataset map to the plasmid. Thanks, Scott