aalto-speech / morfessor

Morfessor is a tool for unsupervised and semi-supervised morphological segmentation
http://morpho.aalto.fi
BSD 2-Clause "Simplified" License

add lru_cache to the segmentation commands #15

Closed svirpioj closed 5 years ago

svirpioj commented 5 years ago

Add an LRU cache for Viterbi segmentations when segmenting corpora with an existing model. Python 3 is supported directly through functools.lru_cache; Python 2.7 requires the backports.functools_lru_cache package.

Time comparison on a 100k-sentence Finnish corpus:

python2.7, no cache     0m51.573s
python2.7, cache        0m29.652s
python3.6, no cache     0m47.650s
python3.6, cache        0m23.818s
pypy, no cache          0m21.565s
pypy, cache             0m14.686s
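
A minimal sketch of the caching pattern described above, not the code from this pull request. The names make_cached_segmenter and toy_segment and the maxsize value are hypothetical; toy_segment stands in for the model's Viterbi segmentation of a single word. The import fallback mirrors the Python 2.7 requirement mentioned in the description.

```python
try:
    from functools import lru_cache  # Python 3
except ImportError:
    # Python 2.7: provided by the backports.functools_lru_cache package
    from backports.functools_lru_cache import lru_cache


def make_cached_segmenter(segment, maxsize=10000):
    """Wrap a word -> segments callable in an LRU cache (hypothetical helper)."""
    @lru_cache(maxsize=maxsize)
    def cached(word):
        # Store a tuple so callers cannot mutate the cached result
        return tuple(segment(word))
    return cached


# Hypothetical stand-in for a trained model's Viterbi segmentation.
calls = []

def toy_segment(word):
    calls.append(word)  # record every real (uncached) segmentation
    return [word[:2], word[2:]] if len(word) > 2 else [word]


cached_segment = make_cached_segmenter(toy_segment, maxsize=2)
corpus = "talo talo talossa talo".split()
segmented = [cached_segment(word) for word in corpus]
```

Because frequent word forms repeat many times in running text, only the first occurrence of each cached word pays for the segmentation, which is consistent with the speedups in the table above. The cache size trades memory for hit rate and would need tuning per corpus.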