Closed mariaelf97 closed 1 year ago
Is it length zero only in the reference coordinates (that’s expected) or also in the query coordinates?
I thought the .bed output refers to the coordinates of whichever genome is passed to nucmer first. In our case that's final.fasta, which is the query genome.
(Sorry, I was reading your question from the email, which didn't include your edit that showed the .bed entry, but now I can see it.)
The first FASTA passed to nucmer is the reference, as shown in the code snippet on assemblytics.com:
nucmer -maxmatch -l 100 -c 500 REFERENCE.fa ASSEMBLY.fa -prefix OUT
The "assembly" here is the "query". The .bed coordinates always refer to the reference, so all insertions have length 0 in reference coordinates, and that is on purpose.
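To make the distinction concrete, here is a minimal Python sketch that parses the .bed row reported in this issue and compares its span in reference coordinates with its span in query coordinates. The column meanings (reference start/stop in columns 2 and 3, variant size in column 5, query coordinates in column 10) are inferred from that row, and `parse_variant` is a hypothetical helper for illustration, not part of Assemblytics.

```python
# Row copied from this issue. It is shown space-separated here;
# splitting on any whitespace handles tabs as well.
ROW = ("1 3932775 3932775 Assemblytics_w_4 75 + Insertion 0 75 "
       "1|quiver|quiver|quiver:3936094-3936169:+ within_alignment")


def parse_variant(row):
    """Return reference span, query span, and reported size for one row.

    Assumes the column layout seen in the row above (an assumption,
    not a documented guarantee).
    """
    fields = row.split()
    ref_start, ref_stop = int(fields[1]), int(fields[2])
    size = int(fields[4])
    # Query coordinates look like "<contig>:<start>-<stop>:<strand>".
    # rsplit from the right in case the contig name itself contains ":".
    _contig, span, _strand = fields[9].rsplit(":", 2)
    q_start, q_stop = (int(x) for x in span.split("-"))
    return {
        "type": fields[6],
        "ref_span": ref_stop - ref_start,
        "query_span": abs(q_stop - q_start),
        "size": size,
    }


variant = parse_variant(ROW)
print(variant["ref_span"], variant["query_span"], variant["size"])
# prints: 0 75 75
```

The insertion occupies a single point on the reference (span 0), while the inserted sequence covers 75 bp of the query, which matches the reported size.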
That makes sense. Thank you!
Hello,
I have been running Assemblytics on some assemblies against a reference genome, and I noticed that in some isolates with multiple SVs in one region, the insertion has a length of zero. Have you run into the same issue?
These are the steps I used :
insertion reported in
1 3932775 3932775 Assemblytics_w_4 75 + Insertion 0 75 1|quiver|quiver|quiver:3936094-3936169:+ within_alignment
The length of the insertion is zero.