PacificBiosciences / FALCON_unzip

Making diploid assembly a common practice for genomic studies
BSD 3-Clause Clear License
30 stars 18 forks

Htg asm refactor #3

Closed pb-jchin closed 9 years ago

pb-jchin commented 9 years ago

Significant workflow change. Instead of doing phasing and haplotig assembly contig by contig, we now do phasing for all contigs first and then haplotig assembly for all contigs at once. This way we only have to scan the overlap files (*.las files) once.
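To illustrate why batching saves scans, here is a minimal sketch, not the actual FALCON_unzip code. It assumes overlap records can be modeled as `(contig_id, read_a, read_b)` tuples. The names `scan_overlaps`, `per_contig_workflow`, and `batched_workflow` are hypothetical and exist only for this example.

```python
from collections import defaultdict


def scan_overlaps(overlap_records, scan_counter):
    """Hypothetical stand-in for one full pass over the *.las overlap files."""
    scan_counter["scans"] += 1  # counted when the pass actually starts
    yield from overlap_records


def per_contig_workflow(contigs, overlap_records, scan_counter):
    """Old shape: one full scan of the overlaps for every contig."""
    result = {}
    for ctg in contigs:
        result[ctg] = [
            rec
            for rec in scan_overlaps(overlap_records, scan_counter)
            if rec[0] == ctg
        ]
    return result


def batched_workflow(contigs, overlap_records, scan_counter):
    """New shape: a single scan, with records bucketed by contig as they go by."""
    wanted = set(contigs)
    by_contig = defaultdict(list)
    for rec in scan_overlaps(overlap_records, scan_counter):
        if rec[0] in wanted:
            by_contig[rec[0]].append(rec)
    return {ctg: by_contig.get(ctg, []) for ctg in contigs}
```

Both functions return the same per-contig grouping. The difference is that the per-contig version scans once per contig, while the batched version scans once in total, which matters when the overlap files are large.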

pb-cdunn commented 9 years ago

(I said I'd take a look at this.)

We could refactor this a bit, but the code is already pretty readable to me. (That's an advantage of pypeflow, and of the script-oriented workflow.) We could add some tests for the complicated bits, but I don't want to slow you down.

pb-jchin commented 9 years ago

Yes. I hope the current code captures the right (or at least a working) structure for the data flow. Building on this, we can start smoothing it out to satisfy various future requirements.