Open renepickhardt opened 10 years ago
If I understand this correctly, these are just performance optimizations, so we are doing neither at the moment and will have to choose a data format in the future if we want to optimize?
Right, we don't do that yet, but I want to leave the issue open since it is a valid enhancement.
Potential bachelor thesis of mine. How to index and compress (skipped) ngrams.
It might already be interesting in this toolkit to index the ngrams using FSTs or trie-based solutions. This is something we should discuss, since it seems like a rather big step, but it would increase performance, reduce storage needs, and make it easier to build applications out of the box, since fast querying would become possible.
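To make the trie idea a bit more concrete, here is a minimal sketch in Python. It is not taken from the toolkit: the class name `NGramTrie` and its methods are hypothetical, and it assumes ngrams arrive as token lists with integer counts. Shared prefixes are stored only once, which is where the storage saving would come from, and a lookup costs one dictionary step per token.

```python
from typing import Dict, Iterator, List, Optional, Tuple


class NGramTrie:
    """Hypothetical token-level trie; each node holds the count of the
    ngram spelled out by the path from the root to that node."""

    def __init__(self) -> None:
        self.children: Dict[str, "NGramTrie"] = {}
        self.count = 0

    def add(self, ngram: List[str], count: int = 1) -> None:
        """Insert an ngram, creating nodes as needed, and add to its count."""
        node = self
        for token in ngram:
            node = node.children.setdefault(token, NGramTrie())
        node.count += count

    def _find(self, tokens: List[str]) -> Optional["NGramTrie"]:
        """Walk down the trie; return None if the path does not exist."""
        node = self
        for token in tokens:
            child = node.children.get(token)
            if child is None:
                return None
            node = child
        return node

    def lookup(self, ngram: List[str]) -> int:
        """Return the stored count of an ngram, or 0 if it was never added."""
        node = self._find(ngram)
        return node.count if node is not None else 0

    def with_prefix(self, prefix: List[str]) -> Iterator[Tuple[Tuple[str, ...], int]]:
        """Yield (ngram, count) for every stored ngram starting with prefix."""
        start = self._find(prefix)
        if start is None:
            return
        stack = [(list(prefix), start)]
        while stack:
            tokens, node = stack.pop()
            if node.count:
                yield tuple(tokens), node.count
            for token, child in node.children.items():
                stack.append((tokens + [token], child))
```

Skipped ngrams could probably be handled by storing a placeholder token at the skipped positions, but that is an assumption and would need discussion. An FST would additionally share common suffixes, at the cost of a more involved construction.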