Apparently, stretches of sequence between masked regions, which seem to have anchors, can get pulled apart and end up as unaligned components in the final graph. This only happens in a couple of chromosomes, but it adds up to a lot of sequence overall. `clip-vg` is changed here to catch them and filter them out. Also add a `-u` option, which in hindsight makes more sense than using a BED file: it just pops out unaligned stretches of more than K bases, which is really what we want. I think.
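For reviewers, here is a rough sketch of the idea behind popping out unaligned stretches of more than K bases. It is not the clip-vg code: the function name `unaligned_stretches`, the `(node_id, length)` path representation, and the notion that a node is "aligned" when it appears in the `aligned_nodes` set are all assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple


def unaligned_stretches(
    path: Iterable[Tuple[int, int]],
    aligned_nodes: Set[int],
    min_len: int,
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) path intervals that are unaligned.

    path          -- (node_id, node_length) steps in path order (hypothetical layout)
    aligned_nodes -- ids of nodes considered aligned (assumption: shared with another path)
    min_len       -- K; only runs strictly longer than this are reported
    """
    stretches = []
    offset = 0        # current position along the path, in bases
    run_start = None  # start offset of the current unaligned run, if any

    for node_id, length in path:
        if node_id in aligned_nodes:
            # An aligned node ends the current unaligned run.
            if run_start is not None and offset - run_start > min_len:
                stretches.append((run_start, offset))
            run_start = None
        elif run_start is None:
            # First unaligned node after an aligned one starts a new run.
            run_start = offset
        offset += length

    # Flush a run that extends to the end of the path.
    if run_start is not None and offset - run_start > min_len:
        stretches.append((run_start, offset))
    return stretches
```

The intervals it returns are what would get clipped; short unaligned runs (K bases or fewer) are left alone, which is the behavior described above.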