ComparativeGenomicsToolkit / hal

Hierarchical Alignment Format

hal2maf refgap problems #174

Open diekhans opened 4 years ago

diekhans commented 4 years ago

This is a ticket to collect information about the problems with hal2maf refgap not correctly orienting inserted sequences.

diekhans commented 4 years ago

joel
I think the best strategy is to document that the column iterator may not return columns in a "nice" order. The hal2maf code would then buffer columns until some "end" point is reached (the end of a reference segment should be safe) and stitch them into the proper order when emitting the block. It's just too hard to "fix" the order in the column iterator so that a valid MAF block can be constructed by blindly appending columns together.

When nesting you have to do these crazy acrobatics, and need to consider whether you are traversing "up" or "down" and whether you're in + or - orientation, etc. The columns are all correct, just emitted at weird

benedictpaten
I think that's right, @joel. Of course, one could implement a buffered iterator that works on top of the basic iterator and does what you say. It depends on where you want to put the logic.

joel
Ah yeah, that's a good point, it could just be an extra layer and not MAF-specific.

glennhickey
Haven't been following too closely, but like the idea of a buffered iterator. Especially as it would simplify multithreading (provided a back end that supports it, of course).
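
For illustration, the buffered-iterator idea discussed above could look roughly like the Python sketch below. It is not HAL code: `Column`, `buffered_columns`, and `block_rows` are hypothetical names, and the sketch assumes the underlying iterator finishes one reference segment before starting the next while possibly emitting the columns inside a segment in any order. It also assumes every column, including inserted ones, can be given a sortable position key, which is the hard part in practice.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for one alignment column (not the HAL API)."""
    ref_segment: int  # index of the reference segment the column belongs to
    ref_pos: int      # key used to order columns inside a block (assumed)
    bases: str        # one character per alignment row


def buffered_columns(
    columns: Iterable[Column],
    key: Callable[[Column], int] = lambda c: c.ref_pos,
) -> Iterator[List[Column]]:
    """Buffer columns per reference segment and yield them in sorted order.

    groupby collects consecutive columns sharing a ref_segment, so the
    buffer is flushed whenever the reference segment changes.
    """
    for _, group in groupby(columns, key=lambda c: c.ref_segment):
        yield sorted(group, key=key)


def block_rows(block: List[Column]) -> List[str]:
    """Transpose an ordered list of columns into one string per row."""
    return ["".join(row) for row in zip(*(c.bases for c in block))]


# Columns arrive out of order within each reference segment.
raw = [
    Column(0, 2, "G-"),
    Column(0, 0, "AA"),
    Column(0, 1, "CC"),
    Column(1, 4, "TT"),
    Column(1, 3, "-G"),
]
for block in buffered_columns(raw):
    print(block_rows(block))
```

Because the wrapper only needs an iterable of columns and a sort key, it stays independent of MAF output, and each yielded block could in principle be handed to a separate worker.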