seanjensengrey / adark

Automatically exported from code.google.com/p/adark
0 stars 0 forks source link

a4: Choose best suffix group to sort #12

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What functionality is missing?
Archon4 always sorts A<B>=C suffixes. Instead, it can choose either A>B<=C or 
A>B>=C together with A<B<=C. Any of these 3 groups is sufficient to deduce all 
other suffixes, so we can choose the smallest one to sort. While in average all 
of them are 1/3 of the input, the actual proportion can depend heavily on the 
nature of the data.

Why is it so important to have it?
It can reduce the number of suffixes required to sort without any performance 
penalty, just with a bit of more (localized) complexity.

What are the implementation stages:
1. Compute the theory behind each group.
2. Implement 3 different inducing paths.
3. Add logic to choose the best one.
4. Test to compare with a4r0.

Original issue reported on code.google.com by kvarkus on 6 Dec 2011 at 3:09

GoogleCodeExporter commented 8 years ago
It seems impossible to restore the order given 00 and 11 groups. And the nature 
of 01 and 10 groups is similar, hence there is next to no point in choosing one 
of them.

Original comment by kvarkus on 15 Dec 2011 at 3:16