During development of searching for client device by partial serial number we came up with TextIndex which will provide a set of indexes, the all combinations of subwords for a given set of words.
This is simple comparison between the two approaches (TextIndex and IndexWriter).
You can use it if you find it useful.
Test word : mississippi
public class TextIndex {
private final Set<String> words;
public TextIndex(String... words) {
this.words = newLinkedHashSet(newArrayList(words));
}
public List<String> generate() {
Set<String> index = newLinkedHashSet();
for (String word : words) {
word = word.toUpperCase();
for (int i = 0; i < word.length(); i++) {
for (int j = i + 1; j < word.length(); j++) {
index.add(word.substring(i, j));
}
index.add(word.substring(i));
}
}
return newArrayList(index);
}
}
Generated Indexes: 53
M
MI
MIS
MISS
MISSI
MISSIS
MISSISS
MISSISSI
MISSISSIP
MISSISSIPP
MISSISSIPPI
I
IS
ISS
ISSI
ISSIS
ISSISS
ISSISSI
ISSISSIP
ISSISSIPP
ISSISSIPPI
S
SS
SSI
SSIS
SSISS
SSISSI
SSISSIP
SSISSIPP
SSISSIPPI
SI
SIS
SISS
SISSI
SISSIP
SISSIPP
SISSIPPI
ISSIP
ISSIPP
ISSIPPI
SSIP
SSIPP
SSIPPI
SIP
SIPP
SIPPI
IP
IPP
IPPI
P
PP
PPI
PI
IndexWriter
Generated Indexes: 25
ssissippi
issippi
sissippi
mis
ssissip
iss
sippi
mississ
m
ippi
ississipp
mississipp
mississippi
i
ppi
missi
mississip
missis
mississi
ssippi
sissi
ississippi
pi
mi
miss
During development of searching for client device by partial serial number we came up with
TextIndex
which will provide a set of indexes, the all combinations of subwords for a given set of words. This is simple comparison between the two approaches (TextIndex
andIndexWriter
). You can use it if you find it useful.Test word :
mississippi
Generated Indexes: 53
IndexWriter
Generated Indexes: 25