PacificBiosciences / FALCON_unzip

Making diploid assembly becomes common practice for genomic study
BSD 3-Clause Clear License

Update read to contig mapping identification #17

Open pb-jchin opened 8 years ago

pb-jchin commented 8 years ago

Currently, the code only tries to find the read-to-contig mapping for reads longer than the length cutoff. For lower-coverage data, we might want to use as many reads as possible for phasing. One way to do that is to scan the reads in the "b-col" (every read in the database) of the LA4Falcon output, rather than the reads in the "a-col", which contains only reads longer than the cutoff. However, since the b-col is not sorted, we cannot process one read's overlaps at a time as a stream; we need a different mechanism that keeps the data in memory to get the results.
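
A minimal sketch of one possible in-memory mechanism, not the FALCON_unzip implementation: because b-col ids arrive in arbitrary order, hits are accumulated in a dictionary keyed by b-read id, with a small bounded heap per read so memory stays proportional to the number of reads rather than the number of overlaps. The function name `best_hits_by_b_read`, the `top_n` parameter, and the column layout (a-read id, b-read id, then a score whose magnitude is the overlap length, at least 12 whitespace-separated fields per line) are assumptions made for illustration and should be checked against the actual LA4Falcon output.

```python
import heapq
from collections import defaultdict


def best_hits_by_b_read(lines, top_n=2):
    """Collect, for every b-col read, its top_n a-col reads by overlap length.

    Assumed (hypothetical) layout per line: a_id b_id score ... with at
    least 12 whitespace-separated fields; abs(score) is the overlap length.
    """
    # b_id -> min-heap of (overlap_len, a_id); the smallest hit sits at the
    # root, so it is the one evicted when a better hit arrives.
    hits = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 12:
            continue  # skip separators or malformed lines
        a_id, b_id = fields[0], fields[1]
        entry = (abs(int(fields[2])), a_id)
        heap = hits[b_id]
        if len(heap) < top_n:
            heapq.heappush(heap, entry)
        else:
            heapq.heappushpop(heap, entry)  # keep only the top_n best hits
    # Return each read's hits sorted from best to worst.
    return {b_id: sorted(heap, reverse=True) for b_id, heap in hits.items()}
```

Each retained a-read id could then be looked up in the existing read-to-contig map to assign the b-read to a contig. Bounding the heap per read is a design choice to cap memory; keeping every hit would need a different structure, such as an on-disk sort by b-read id.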