pmelsted / bifrost

Bifrost: Highly parallel construction and indexing of colored and compacted de Bruijn graphs
BSD 2-Clause "Simplified" License

Feature request: Represent assembly gaps using Jump records #85

Closed by bredeson 4 months ago

bredeson commented 4 months ago

Hi @pmelsted, starting with GFA v1.2, assembly gaps (runs of N) can be represented using Jump (J) record lines. I would really love to see this supported, as nearly all mid-sized and large genomes contain assembly gaps. As far as I can tell, Bifrost omits references containing assembly gaps from colored pangenome graphs. Is my understanding correct?

Best, Jessen
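
For illustration only, here is a minimal Python sketch of what the request amounts to: a gapped sequence is split at its runs of N, each ungapped piece becomes a segment (S) line, and each gap becomes a Jump (J) line whose distance field is the gap length. This is not Bifrost code or output. The function name gapped_sequence_to_gfa and the segment naming scheme (name_1, name_2, ...) are hypothetical, and the J line layout (From, FromOrient, To, ToOrient, Distance) follows my reading of the GFA v1.2 specification.

```python
import re


def gapped_sequence_to_gfa(name, seq):
    """Return GFA 1.2 lines for one sequence, turning N runs into J records.

    Hypothetical helper for illustration; segment names are made up.
    """
    # Each maximal run of non-N characters becomes one segment.
    pieces = [(m.start(), m.end(), m.group())
              for m in re.finditer(r"[^Nn]+", seq)]

    lines = ["H\tVN:Z:1.2"]
    for i, (_, _, bases) in enumerate(pieces, start=1):
        lines.append(f"S\t{name}_{i}\t{bases}")

    # One Jump per gap: distance = number of Ns between adjacent pieces.
    for i in range(1, len(pieces)):
        gap_len = pieces[i][0] - pieces[i - 1][1]
        lines.append(f"J\t{name}_{i}\t+\t{name}_{i + 1}\t+\t{gap_len}")
    return lines


if __name__ == "__main__":
    print("\n".join(gapped_sequence_to_gfa("chr1", "ACGTNNNNNTTGA")))
```

Assembly gaps are often placeholders of unknown true size, in which case the specification allows the distance field to be written as * instead of a number.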

bredeson commented 4 months ago

Never mind, I did some more experimenting and I think the above is not true. My initial tests were performed with sequences that were identical except that one had an assembly gap, so Bifrost output only one segment line, representing the ungapped sequence. I misled myself. :-|
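
The effect described above can be sketched in a few lines of Python. The sketch assumes that k-mers containing any non-ACGT character are skipped during graph construction, which this thread does not state explicitly. It also ignores canonical (reverse-complement) k-mers and uses made-up toy sequences. Under that assumption, every k-mer of the gapped sequence also occurs in the ungapped one, so the gapped sequence adds nothing new to the graph.

```python
def kmers(seq, k):
    """Return the set of k-mers of seq that contain only A, C, G, T."""
    valid = set("ACGT")
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k].upper()
        if set(kmer) <= valid:  # drop any k-mer overlapping the gap
            found.add(kmer)
    return found


# Toy data: the two sequences are identical except for the run of N.
ungapped = "ACGTACGGTCA"
gapped = "ACGTANNGTCA"
k = 3

gapped_only = kmers(gapped, k) - kmers(ungapped, k)
print("k-mers unique to the gapped sequence:", sorted(gapped_only))
```

If the two inputs differed anywhere outside the gap, the set difference would no longer be empty and the gapped sequence would contribute its own k-mers.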