"I just realised something that could potentially speed up the mapping. Those ranks you need for jumping from the last column to the first column in the Burrows Wheeler matrix. In the standard FM-index algorithm they don’t compute them right at the time when you’re mapping each character from each read, because you’d have to compute the same thing multiple times, every time a read covers that region. Instead, what they do is precalculate all ranks for all letters and store them. Because this takes up quite a lot of memory, they actually store only a fraction of the rows and infer the gaps from the nearest stored value. I thought that’s what the sdsl guys are doing as well when I’m calling the rank function, they’re just accessing the structure that stores the ranks. But no, I just digged through their code and they’re actually calculating the rank on the spot. So I think it’s worth trying this out."
This from Sorina
"I just realised something that could potentially speed up the mapping. Those ranks you need for jumping from the last column to the first column in the Burrows Wheeler matrix. In the standard FM-index algorithm they don’t compute them right at the time when you’re mapping each character from each read, because you’d have to compute the same thing multiple times, every time a read covers that region. Instead, what they do is precalculate all ranks for all letters and store them. Because this takes up quite a lot of memory, they actually store only a fraction of the rows and infer the gaps from the nearest stored value. I thought that’s what the sdsl guys are doing as well when I’m calling the rank function, they’re just accessing the structure that stores the ranks. But no, I just digged through their code and they’re actually calculating the rank on the spot. So I think it’s worth trying this out."