Closed stasoid closed 3 years ago
Wow, hello!
The main application for the constructed suffix array (back when I was working on it) was BWT. For its purposes, the rules of SA construction could be slightly bent in order to ease the construction algorithm.
One of such changes is assuming that there is a terminator after the input string, so instead of working on "abracadabra" the algorithm actually works on "abracadabra$", where $ - the terminator - could be either lower than all symbols or higher, based on what was more convenient.
Another trick that Archon4 (at least) did was considering the source to be reversed, so that we could do full dword
comparisons on the substrings. There is a REVERSED
define to control this.
If you combine these two, then the actual source being processed was: "$abracadabra", where $ is the lowest symbol, but the sub-strings going right to left instead of left-to-right. The lowest substring is a$
, then adacarba$
, then acarba$
, etc, which makes it "{1, 6, 8, 4, 11, 2, 9, 5, 7, 3, 10}" for Archon4.
The only difference with Archon7 is that this one considered "$" to be higher than all other symbols. So suffix "1" became the last of the "a..." substrings, and so forth.
Thank you for the detailed answer. I thought there exists simple transformation from p
to what I need, but as it turns out some nontrivial code modifications are required. So I better off using libdivsufsort then.
Hello, I am trying to use archon4 for computing suffix array, but after calling
compute()
variablep
does not hold correct suffix array. For example, for input "abracadabra" suffix array should be{10, 7, 0, 3, 5, 8, 1, 4, 6, 9, 2}
or{11, 8, 1, 4, 6, 9, 2, 5, 7, 10, 3}
with 1-based indexes, butp == {1, 6, 8, 4, 11, 2, 9, 5, 7, 3, 10}
. How can I obtain the correct suffix array?Also, the same question for archon7, which has
ar.P == {6, 8, 11, 4, 1, 9, 2, 5, 7, 10, 3}
.