Closed: viking-sudo-rm closed this issue 4 months ago
Implemented in 3214527.
This sped up the traversal of the Pile CDAWG compared to the Python implementation, but it is still slow (estimated by TQDM to take 400 hours). Randomly accessing disk seems to be a real bottleneck.
This feature is required to collect data for the design of #99.
The original Python implementation was very slow (would take ~80 days).
This is way too slow, especially because `cdawg.node_count()`, which implements very similar logic, takes just a couple of minutes. To make it faster, we could make the following improvements:
- Use a `Vector<bool>` or `bool[]` of length `cdawg.node_count()`
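A minimal sketch of what the boolean-array suggestion could look like, assuming the array serves as a per-node "visited" flag during the traversal (the issue does not say so explicitly). The graph here is a plain adjacency list standing in for the CDAWG; `count_reachable` and `adjacency` are hypothetical names, not part of the actual codebase.

```rust
/// Count the nodes reachable from `source` using a depth-first traversal.
///
/// `adjacency[v]` lists the targets of the outgoing edges of node `v`.
/// This is a hypothetical stand-in for the CDAWG's edge structure.
/// Assumes `source < adjacency.len()`.
fn count_reachable(adjacency: &[Vec<usize>], source: usize) -> usize {
    // One flag per node, indexed directly by node id. Lookups and updates
    // are plain array accesses, with no hashing involved.
    let mut visited = vec![false; adjacency.len()];
    let mut stack = vec![source];
    visited[source] = true;

    let mut count = 0;
    while let Some(node) = stack.pop() {
        count += 1;
        for &next in &adjacency[node] {
            // Mark a node when it is pushed so it is never pushed twice.
            if !visited[next] {
                visited[next] = true;
                stack.push(next);
            }
        }
    }
    count
}
```

A `Vec<bool>` costs one byte per node, so whether it fits in memory depends on the node count. If that is too much, a packed bit vector would cut the footprint by 8x at the cost of some bit arithmetic per access.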