Currently, the code only tries to find the read-to-contig mapping for reads longer than the length cutoff. For lower-coverage data, we might want to use as many reads as possible for phasing. One way to do that is to scan the reads in the `b-col` (every read in the database) of the `LA4Falcon` output, rather than the reads in the `a-col`, which contains only reads longer than the threshold. However, since the `b-col` is not sorted, we need a different mechanism for keeping the data in memory to get the results.
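
One possible in-memory mechanism is a dictionary keyed by b-read ID that is updated as overlap lines stream past, so the order of the b-col does not matter. The sketch below is only an illustration under stated assumptions, not the existing implementation: it assumes whitespace-separated overlap lines with the a-read ID, b-read ID, and an integer score in the first three columns, that a lower (more negative) score means a better overlap, and that an a-read-to-contig mapping is already available. The function name `best_contig_per_b_read` and the argument `a_read_to_contig` are hypothetical.

```python
def best_contig_per_b_read(overlap_lines, a_read_to_contig):
    """Assign each b-read to the contig of its best-scoring a-read.

    Assumptions (not confirmed by the original text):
      * columns 0, 1, 2 are a-read ID, b-read ID, integer score
      * a lower (more negative) score is a better overlap
      * a_read_to_contig maps a-read ID -> contig ID (hypothetical input)
    """
    best = {}  # b-read ID -> (score, contig); needed because b-col is unsorted
    for line in overlap_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip separator or malformed lines
        a_id, b_id = fields[0], fields[1]
        try:
            score = int(fields[2])
        except ValueError:
            continue  # skip lines whose score column is not an integer
        contig = a_read_to_contig.get(a_id)
        if contig is None:
            continue  # this a-read has no known contig
        current = best.get(b_id)
        if current is None or score < current[0]:
            best[b_id] = (score, contig)
    return {b_id: contig for b_id, (_score, contig) in best.items()}
```

Memory use grows with the number of distinct b-reads rather than the number of overlap lines, which is the trade-off for not relying on sorted input.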