ekg / edyeet

base-accurate DNA sequence alignments using edlib and mashmap2
MIT License

Controlling granularity #4

Open egoltsman opened 3 years ago

egoltsman commented 3 years ago

Hi Erik, I'm using edyeet to induce a graph (with seqwish) on a small set of sequences that contain mostly large indels (~4-6 kb). It seems that in this case edyeet is trying too hard to do base-level alignment where it should've either terminated or opened a large gap. In the first case below, there is a 5 kb inverted duplication (I know because it was synthetically introduced) at pos 7,544,324 on Accn1, but the aligner attempts to extend the alignment past the breakpoint following the initial ~50 kb match. Similarly, in the second case a 5 kb inversion occurs at pos 7,573,027, but instead of terminating the alignment, edyeet pushes through a region of virtually no identity. This leads to tiny graph segments and structures that later get called as bogus variants. I tried raising the -p cutoff to 95%, but that results in the entire 50 kb block containing the inversion not being reported, so it seems this cutoff applies across the entire block. Is there anything else you could suggest tweaking that works at a local level, something like a gap extension vs. mismatch penalty in Smith-Waterman? Thanks!

Accn1   75071545        7500000 7550000 +       Accn2   75021975        7490030 7539604 49550   50000   23      id:f:0.99538    ma:i:49550      mm:i:13 ni:i:423        nd:i:11 ns:i:14 ed:i:461        al:i:50011      se:f:0.00921797 cg:Z:44326=10D4998=5I2=7I1=7I1=6I1=1I2=3I1=1I2=1I2=1I2=1X1=1X1I1=1I2=1X1=11I1=8I2=2I1=3I1=7I1=7I1=1I1=2I1=4I1=2I1=1I1=1I1=3I1=3I2=3I1=11I1=1I1=3I1=2I1=2I2=5I1=6I1=9I1=3I1=1I1=1I1=9I1=3I1=6I2=2I2=3I1=1I1=4I3=4I2=3I1=1I1=1I2=2I1=1I3=1I1=1D2=1I1=2I1=3I1=1X2=3I3=3I1=2I4=5I2=1I1=1I1=1I3=4I2=5I2=2I1=2I1=2I3=5I1=5I1=1I2=3I3=13I1=1I2=4I1=3I3=6I1=4I2=2I2=2I1=3I2=5I2=1I1=1I1=2I1=5I5=7I1=1I1=2I1=1I1=1I2=1X5I1=1I1=2I1=2I2=5I2=3I2=7I1=1I3=1I2=13I5=1X2=2I1=2I2=3I3=1X1=2I3=1X3I1=2I2=1I3=3I2=3I1=2I4=3I1=1I2=2I1=3I1=3I2=2I2=1X7I1=1I2=1I1=1I1=6I1=1I1=5I1=5I1=3I1=5I1=1X1=1I1=1I1=2I5=1X2I1=1I2=1I1=2X5I1=1I1=1I1=1I2=1I2=1I1=1I2=14I
Accn1   75071545        7550000 7600000 +       Accn2   75021975        7545041 7594371 47271   50000   14      id:f:0.958261   ma:i:47271      mm:i:1450       ni:i:609        nd:i:609        ns:i:670        ed:i:3338       al:i:50609      se:f:0.0659566cg:Z:23026=1X2D1=1X1=1D1=1I1=2D1=1X1D2=1X1=1D1=1D1=1D1=1X1D3=1X2=1D1=1X1D1=1D1=1X3=1X1D1=1X1D4=1X1=2I3=2D1=1D4=1I3=2X4=4X1=2I1=1I1=1X2=1X2=1I1=1X2=1I2=1I1=1I1=3X1=1X1=2I2=2X2=1X3=3X1D3=2X1=1X1I1=2X1=2X1=1X2I1=2X6=1X1=1X2=1D3=1I1=1X2I1=3I5=3X2=1X1D2=1X2=1X1=1X1D1=1D2=1I1=1I4=1X1=1D1=1D2=2I1=3X1I2=1I1=1D1=1X1=1D1=1X1I1=1D2=1X1D1=1I2=2X4=1X3=2X1I4=2X2I3=2I3=1X1=1X1D2=3X1=1D1=1X1I2=1D2=1I2=1D1=1D1=1I1=1X1=1X1=2D1=1D2=2D6=1X3D1=1X1=2X2=1I1=1X1=1I3=1X1=1X2=1X2D1=2D1=1D3=1X1=1D2=2X1D1=1X1=1I5=1I2=3X3=1X1=1D1=1I2=2X1=1I1=3X1=1X1I1=1X1=1D2=1X1I1=2X2=2X1=1X1=1I1=1X1D1=1I2=1X1I1=1X1=1X1=1X1I2=1X3=2X1=5D4=1X1=2X1=1I6=1X1D1=1I2=2X2=2X2I1=2X2=1X1=1D1=1D1=1X1D1=1I2=1X1I1=2X3=1X2=1X1=1I2=1D1=1I2=1D4=1D1=5D3=2D1=1X1=1X1=1X1=2X2=1X2D1=1D3=1I1=1X3=2X1D2=1X1=1X2=1X1D1=1X2D1=1X4=2X1=3X2D2=1X1=1D2=1X2=1X3=1X1D2=1X1D1=1D1=1X2=1D1=1X1D4=1X1D1=1X2=1D2=2D1=1X2=1I3=2X1I1=2X3=2D3=2X1=1I2=1X1D3=1X1=1X1=2X1D1=1D1=2D2=2X1D2=1D1=1X1=2X1=1I2=1D1=2D1=1D3=1I1=2D3=3X1=2X1D4=1I1=1I2=2I5=1X1=1I2=1X1D3=1X2D2=1X2=2X1D2=2X3=1X1=1X1D1=1X1=2D2=4D1=1X2=1X1=1I2=1D1=2D1=1D2=1X1=2D3=1D3=1D1=1I2=2X1=1X1D1=1X1=2X1=2X1I2=1I1=1X1=1X1=2I3=1D2=2X2=1D3=1I1=1X1=2X2=2D1=1D2=1X1=1X1=1D1=3D1=1D2=2X1=2X1=1X3=1I2=1X1=2X2=2X1=1D1=3X3D1=1D2=1X2=1I1=1X3=1X2=1X2=1X1=2D5=1X1=1X1=1X1D2=1X1=1D1=1X1D4=1I2=1X2=1D4=3X1D2=1D1=1X1=3X1=2X3=2X1I2=2D3=2X1=1X1I2=1X1D1=1I2=1I1=1X1I3=3X1=4X1D1=1D1=1X1=3X3D1=1X3=1X1=2D1=1X3=1X1D1=1X2=1D1=1X1=3X1=1X1=2D1=2X3=1D2=1X2=1X1=1D1=3D1=2D1=1X1D4=1X1D2=1X1=1X1I3=1X2=1X1=1X3=2X1D2=1X1D2=1I2=2X2=1D1=1X1=3X2I1=2X2=1X2=4D5=1D2=1X2=1X1I2=2X1=1X1D3=3X2=1X2=1D2=1X1=1D1=1I1=1X1=1X1=1X1=1X1=1X1I2=1I2=2X3=2I4=1X1=1D4=2I1=1X1I2=1X2=1X2=2X2=1X1=2X1=1X1=1D2=4I2=1I3=1X1D4=1X1=1X2=2I3=1X1=1X1=1D1=1X1=1X2=1I2=1D1=1X1=1X2D3=2D2=1X2=1X1I2=1I1=1X1=1X1I3=3X2=1X1=2X1=1X1=1X1
=2D1=1D1=2D1=1D3=2X1D3=1X1=1X2=1D2=1I1=1X1=1I2=1I2=1I1=1D1=1X3=3X1=1X1=1X4=1D2=1X1=3X1=1X1=2X3=1X1=1D2=3D6=1X1=1X1=1X2=1D2=2X1=1X1I1=1D1=1D3=1X2=1X1=3X1=1I2=1I4=1X1=3I1=1I1=1X1=1X1=1X2=1D1=1X1I2=1D3=2D1=1X1=1I1=1D1=1D2=1X2=1X3=1D1=2X1D1=3I2=1X2=1X1=1X1=1I2=1I1=1I1=1X3=1X2=1X1=1X1=1X1=1X1=1X2=1D3=1D1=3I3=1X1=2X2=1X1=2X2=4X2=1X1I1=1X1=1I1=1X1=1X1D1=1I3=2X1=1I1=1I1=1D2=2X1=1X3=1X1=1X1=1X3=1X2D1=2D2=2X2=3X2=1X1D1=1D2=2X2=1X1=1D2=1I2=1D1=1X1D1=2D1=1D2=2X4=1X2D1=1D1=1I2=1X1=1I2=1X2=1I1=1I2=1X4=2D2=1I1=1X1=1X3=2X2=1X2=1D2=1X1=4X1=1I2=1X1=1X1=2X1D2=2D4=1D2=1D2=3D3=2X1D1=1I1=1D2=1X2=2X2=1I1=1I1=1X1=1D1=1X2=1X1=1I3=1X1=1X2=2X2=1X1=2X1D1=1D1=3X1=1I3=1X1=1I2=1I1=1I2=1I2=1X2=1X1=1I4=1I1=1I2=1I1=1X4=3I2=3X2=1D2=1D1=1I1=1X3=1D1=1X3=1X1I2=1I1=1I1=1I2=1X2D1=2D2=1X1=1I2=1X1=1D4=1I1=3X1=2X1=1X1D1=1X2=1X4=1X1=1X1I2=1X1=1I1=1I3=2X2=1X1=2X2D3=1X2D4=3X3=2I1=1X2=1X1=2I2=1X1=1X4=1X1=1X1D2=3X2=1X4=1I1=1X3I2=2X1I2=2X1=1I1=2I5=1X1D1=1X1=1I2=1I2=1I1=1X2=1X1=1I1=1I3=2I2=1X1=2I1=2I2=1I3=3I1=2X1=1I3=1X2I1=1I2=3X2=1X2I1=1X1=1I1=1D4=1I2=1X2I1=1X1=1X1=2I1=1I1=1I4=1X1=1X1=1I1=2I1=3I3=2X2=1I2=1I1=2I3=1X3=1X1=1I2=1X2=1X3D2=1I1=1D2=1X4=6X4=2X2I2=1I4=1X2=2X1D1=1D1=1X1=1D1=1D1=3X3=1X1=1X1I1=1X2=1I1=1X5=4D4=1X1=1D1=1X1=2X1=6X1=2X2=1X1I1=1X4=1X2D6=2X1D4=2X1=1D2=1X1=1X1=2D2=2D1=1X5=2X1I1=1I4=2X4=1X1I2=2X1=1D2=1X1I2=1I1=2D2=2D4=1X2=1X1I1=1X1I1=1I1=1X1=1I1=1I1=1I1=3X3=1I1=1X2=1I1=1I1=1X1=1X1=1X1D4=2X3I5=1X2I4=5X1=1X1=1X2=1X1D4=2X1=4X1=1I1=2X1=1X5=1I1=1X2=1X1=2X1=2X1=4X1D1=2X1=1D1=2X2=1X1=1X1D2=1D1=1D3=1X4=1X1=1D1=1X2=1X1D2=1D2=1D1=2X1D1=2D2=1D1=1X1D1=1X1=1D2=1X1D1=1X1=1D1=2D3=1X2D1=1D1=1D5=1I2=1X1D2=1I1=2X1=2X4=1D1=1X1D3=2D2=1X1=2D2=1X1I1=1X1=2X1I1=2X2=1I3=1X2=3X1=1X2=1X1=1X1I1=1X2=1D2=1X1=3X1=1X4=2X1=6X2=1I1=1X1I2=1D1=3X1=1I1=2D2=1X1=1I1=2X1=1X1=1X1=1X4=1X1D1=1X2=2X1=3X1=2X1D1=1X1D1=1X1=1X1=1X1=3X1=1D1=1X1=1X2=2D2=1X1D1=1D3=1X1I4=1I3=1D1=1X1=1D4=2D2=1X2D1=1D1=1D1=1D4=1X1D1=1D3=1X1=1X1I2=1X1I2=1X1D2=1X2=1I1=1I4=1X1=2X1=1X2=1I1=2X2=2X2=2X1=1D2=1X1=2X1=1X4=1D1=1D2=1X2=1X1I2=1X1D2=2X1=1D4=1X2I4=1I1=1I1=1I1=1X1=1I1
=3I5=1X1I1=1I3=1D4=1X1D3=1X2=1I1=3I2=1X1=1X1=1X1=2X1I1=1X1=1X1=1X1=1X1=2X1I1=3X1I1=2X2=1X1=1X1=1I3=1X1=1X1=1X1=2X1=1D1=1X2=2X1=1D1=1X2I1=1I2=1I4=7X4=1X2D1=1D1=3X1=1X2=1I2=1X1=1X1D1=1X2=1X1=3X2=1X3=1D2=2X1=2X1D1=1X1=1X1D2=2I1=1X2=2I3=1X3=2I2=2X1=2X1=1X1=1I2=1D1=1D5=1I1=1X1=3I3=1I1=1I1=1X1I1=1X1=1I2=1X1=1X2I3=2X1I1=3I3=2I2=1X1I2=1X1=1X1=1I4=1X3=1I1=1I2=1X1I1=1X2=2X2=2X1=4X2I1=2X1=2X1=1X2=1X1=1D5=1X1=2X1=1D1=4X1=2X4=1X1I2=1X1=1X1=5X4=1X2D5=2X3D4=1X1=2X1=1I1=1D1=1D2=1X1=1D3=3X1=1D2=1D1=2X1=1D1=2D1=1X1=1D1=1X4=2I2=2X1I2=1X1D3=2X1I2=1X1D4=2X5=2X1D1=1D4=1X1=2I2=1I1=1X1I1=1X3=2X1=1I3=2X1I6=1X2I4=1X1=1X1=1D1=2X1=6X1=2X1=1X1=1I1=1X4=4I5=1X2=2I3=1I2=1X1=1X1I1=1X1D1=1X1D1=1X1=1I1=1X1=1X1=1D1=1X4=1X2=1X2D2=1D2=6X4=1X2=1I1=1D2=1X1=3I1=1X3=1X2=1D1=1X3=2D1=1D2=1D2=2X3=1X2D1=2D2=2D1=1X4=1D1=1D1=2D1=1X1=1X1=1X2D3=1D3=1I1=1D1=1X1=1X2=3X2=1X1=2D2=3D1=1D1=2X1=1D2=2D1=1D2=2X1=1D3=4D5=1D1=1X2=2D1=1D3=1D1=1X1D1=1X1I5=2D2=2X1D2=2X1D2=1X1=4D4=1X2=3X2=1X1=1X1I4=1X1=1X2=2D1=1X2=1X3=2D1=3X4=2X2=1I1=1D2=1D2=3I1=2I3=1X1=1D1=1D2=2X1=1D4=1X2=1X1=2X1=1X1=2X1D1=1X1I4=1I1=1X2=1D1=1X2=2I1=1X2I3=1X1=2D3=2D2=1X1=1I3=1X1=1D1=1I2=1I2=3X3=1X2=1X1=1X1=1D2=1X1=1D3=1I1=1X3=1X1D2=1X2=1X2=2D1=1D3=1X2=5X1=2D1=1D2=2D2=1X1=1X3=1X1=1D2=1X1=1I1=1X2=1D2=2X2=1X1D2=1I1=1D1=2X1I3=3I2=1I2=1I4=2I2=2X1=2X2=1I1=1D1=4X1=1X3=1I1=1X2=2X3=1X1=1X1=1D2=2I4=1X2=1X1=1D2=1D3=1X1D2=1D1=1X2=2X3=2I1=1X1=1I1=1X1I2=1D1=1X3=1D2=1I1=2I3=1X2=4X1=3I2=2I2=1X1I1=3I3=1X1=1X1=1X3=1X1=2X2=1I1=1D2=2X1=1D2=1D1=2X1=1D1=1I1=1X1=1X1D2=4X2=2X1=1X2=2X1=1X3=2X1D1=1D2=1I2=1X1=1X1=1X1=1X1=1X2=1X3=1X2=2D3=2X1=1D2=1X2=3D1=2X2=2I2=1X2=1X2=1I1=1I2=1X1D1=2I3=1I2=1X1D1=1I2=1X1=1X1=1X1=1X1=1D1=1D2=2D2=1D3=3X1=1X1D2=1X3=1I1=1I1=1X1D1=2X2=1I2=1X1=1X1=1X6=3I2=1I1=1X3=2X1=1X1=3X1=1X2=1I4=1X1=1X1=3X3=1X1=1I2=1D1=1D3=1X1D1=1D3=1I1=1X1=1X3=2X1I3=1I1=2I1=1I1=1I1=1X1=2X1=1X1I1=1X2=3X3=1X1D1=1X2=1D1=1X2=1X1D2=1X1=1I2=1X1=1I1=3X1I2=1D1=1X1=1X1=1D2=1I2=1X1=1X3=1X1I4=1X1I4=1D1=4D2=1X1=1I1=2X1=1X2=2X2=1X2=1X2=1X1=2D1=1D4=1X1I4=2D3=2X2=1D2=1X1D1=1X1=1X1=1X1=1X1=1D1=1X1=1I3=1I1=1
X2=3X3=1X1I1=2X2=1X1D2=1X2=1I5=3X1=3X1I2=1D1=3X1I1=1I1=1I2=2X3=1D1=1X1I2=2X1I3=1X1=1X2=1X3=1X1D1=1X2=1X1I4=1X1I1=2I2=1X1=1X3I3=1I1=1I3=2X1=1I1=1X1I1=3X1=1X2=1I1=1X1=1X1=1I2=1X2=1X2I3=1X1=5X1=2X1I2=5X1=1X1I1=1D1=1D1=2X3=3I2=1X1D1=2X3=2I2=1I1=1D2=1D1=2X1=3X1=1X1=1I2=3X2=1I2=1I2=1X4=1D2=1X1=2I1=1X2=2X1=1I1=1X5=1X1I1=1I2=1X2=1X3=1X1=1D2=1X2=1I1=3X1I1=2I1=3X2=2X1=1I2=1X3=1D1=2X1=2X2=1X2=4I1=1I1=1X2=1X1=2X2=2X1I5=1D1=1D1=1D3=2X1=2D3=1X1=2I1=1X1=1X1=2X1I2=1D2=1X1=2X2I1=1I3=1I3=1X1I1=1I4=2X1I2=1X1I1=1X1=1I1=4I2=1X1=2X1=2I1=1I3=2X2=2X1I2=1X2=1X2I3=1X1I2=1D1=1X5=2D2=1D1=1D4=3X1=2X1I3=2I3=1D1=1I2=1I1=3X1=2X1=1I1=1I2=2X1I2=1X2=2X3I1=1X1=1I3=1X1I3=1I1=2D4=3X1=2X1=1X1I3=1D2=1X2=2I1=1I2=1X1=1X1I4=1X1I1=1I2=1X1=1X1=2I2=1X2=1I1=1X2=1X3=1X1=1I1=5X1=2I4=1X1=1X1I1=1X2I2=1X1=1X2=2X1I3=1X1=1D3=1X2=1I1=1X1D1=1D2=1X1=3X1=3X3=1I1=1X1D1=1I2=1X1=2X1D1=1X1=1X1=1I1=1I2=1X3=1X1=1I1=3X1I4=1X1=2X1=1X2I1=1X2=1I1=1I1=1X1I1=1I2=1I3=1D1=2X1=1X5=2X1D1=2X1I2=2X1D2=1X1=3X1I1=1I1=1D2=1X2=1D1=1I1=1I1=1I1=2X1=1I1=1I1=1D3=3X1=1X1=3D1=2X2=1D2=1X1=1I2=3X2=1D6=1X1D1=2X2=1X1I1=1I3=1I1=1X1=3I2=1X1I1=1X3=1D1=1X2=1X1=1X1D1=1X1=1X1I1=2I5=2I3=1I1=1X1D1=2X2=2X3=1X1=1D1=1I1=2X1I1=1I1=2I2=2X1=1I3=2D3=2X2D4=2X1D3=1X4=2X2=1D1=1X1I2=1I1=1X1D2=1X1I1=1I2=1X1=1X1I1=2X1=2X1=2D1=1D5=2D2=1X2=1X2I2=1X2=1X1=1X1=2X1I5=1D1=1X4D1=1D3=1I2=1X1=1X6=2X1=1X2D1=2X1=2X1=1X1D1=2X3=3X1=1X2=1I2=2X2=2D1=1X1=1I1=1X1D1=1D1=1D2=1D2=1X1=1D2=1X2=1X1=1D1=2D1=4X4=2X3=1D4=1I2=2I2=2D1=1X4=2X2=2I2=1X2=1X1I1=1I1=1I1=1X3=1X1I2=1X1I3=1X1=1I1=3X2=4I1=1X2I21303=670I
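
To see where along a block the identity collapses, the `cg:Z:` strings in records like the ones above can be scanned in fixed-size windows. This is only a rough sketch, not part of edyeet: it assumes an extended CIGAR that uses only the `=`, `X`, `I` and `D` operations, and the function names and the window size are made up for illustration.

```python
import re

# Extended CIGAR operations as they appear in the cg:Z: tag (=, X, I, D).
CIGAR_OP = re.compile(r"(\d+)([=XID])")


def cigar_ops(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def window_identity(cigar, window=1000):
    """Return (column_start, identity) for consecutive windows of alignment columns.

    Identity is the fraction of columns in the window that are exact matches (=).
    """
    out = []
    start = 0      # first alignment column of the current window
    cols = 0       # columns consumed in the current window
    matches = 0    # '=' columns seen in the current window
    for length, op in cigar_ops(cigar):
        while length > 0:
            take = min(length, window - cols)
            if op == "=":
                matches += take
            cols += take
            length -= take
            if cols == window:
                out.append((start, matches / window))
                start += window
                cols = matches = 0
    if cols:  # trailing partial window
        out.append((start, matches / cols))
    return out


def low_identity_windows(cigar, window=1000, min_identity=0.9):
    """Column offsets of windows whose identity falls below min_identity."""
    return [s for s, ident in window_identity(cigar, window) if ident < min_identity]
```

Run on the second record, windows overlapping the inversion would be expected to show up as low-identity, even though the block as a whole still passes a 95% cutoff check differently than each window does. Offsets are in alignment columns, so they need converting to sequence coordinates before comparing against the known breakpoints.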
ekg commented 3 years ago

The linear gap model in edyeet is probably not suitable for this target. If you apply it as part of pggb, with smoothxg afterwards, it should be fine. But directly inducing the graph with seqwish is going to generate a mess if that's the only step.

I would suggest using wfmash. There you may find a need to increase the -l and -a parameters or turn off adaptive banding to get the best alignment.
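
As a rough illustration of why a linear gap model behaves this way (a sketch only; the penalty values are arbitrary and are not the defaults of edyeet, wfmash or any other tool): under a linear cost one long gap and many scattered short gaps of the same total length score identically, so nothing pushes the aligner to consolidate them. An affine cost charges for each gap opening, which makes the single long gap cheaper.

```python
def linear_gap_cost(length, per_base=1):
    """Cost of one gap when every gapped base costs the same."""
    return per_base * length


def affine_gap_cost(length, gap_open=6, gap_extend=1):
    """Cost of one gap: a one-time opening penalty plus a per-base extension.

    gap_open and gap_extend are hypothetical values chosen for illustration.
    """
    return gap_open + gap_extend * length


def total_cost(gap_lengths, cost_fn):
    """Sum the cost of a list of independent gaps."""
    return sum(cost_fn(n) for n in gap_lengths)


one_long_gap = [5000]           # a single 5 kb indel
many_short_gaps = [10] * 500    # the same 5000 gapped bases, scattered

for name, gaps in (("one 5000 bp gap", one_long_gap),
                   ("500 gaps of 10 bp", many_short_gaps)):
    print(name,
          "linear:", total_cost(gaps, linear_gap_cost),
          "affine:", total_cost(gaps, affine_gap_cost))
```

With these numbers the linear model scores both layouts at 5000, while the affine model scores 5006 for the single gap and 8000 for the scattered ones.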


egoltsman commented 3 years ago

Thanks! But what is different about running pggb vs. running edyeet followed by seqwish and smoothxg? From the description on the pggb page, it seems to run precisely these three tools.

ekg commented 3 years ago

Oh it's the same thing. I was just referring to the whole process. The seqwish graph is very "literal" in representing the raw alignments. That can make it hard to work with. You know this though.


egoltsman commented 3 years ago

Ok, gotcha. Are there perhaps more aggressive smoothing settings in smoothxg that you would tweak to help get this properly represented in the graph?