taishi-i / nagisa

A Japanese tokenizer based on recurrent neural networks
https://huggingface.co/spaces/taishi-i/nagisa-demo
MIT License
382 stars 22 forks

add cache layer to Tagger #3

Open bung87 opened 6 years ago

bung87 commented 6 years ago

If Tagger is instantiated at function level, it loads the dictionary on every call. If it is instantiated at module level, it loads the dictionary at import time, even though the tagger may never actually be used. For reference, see https://github.com/fxsjy/jieba/blob/master/jieba/__init__.py

taishi-i commented 6 years ago

Thanks for the advice. I understand this as adding lines such as tagger = Tagger() and function aliases (e.g., tagging = tagger.tagging) to __init__.py. Is that correct?

bung87 commented 6 years ago

You could add a singleton tagger instance to __init__.py as a shorthand that uses the default dictionary, and implement an initialize method that performs the actual I/O and uses a cache layer. When tagger.tagging or any other method exposed to end developers is called, check whether the tagger has been initialized; if not, initialize it first.
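A minimal sketch of that idea, not nagisa's actual code: LazyTagger, _load_default_model and load_count are hypothetical names, and the loader is a stand-in for reading the real dictionary and model files.

```python
import threading


class LazyTagger:
    """Hypothetical wrapper that defers the expensive load until first use."""

    def __init__(self, loader):
        self._loader = loader          # callable that performs the real I/O
        self._model = None             # stays None until initialize() runs
        self._lock = threading.Lock()
        self.load_count = 0            # only here to make the behavior visible

    def initialize(self):
        # Double-checked locking: concurrent callers trigger a single load.
        if self._model is None:
            with self._lock:
                if self._model is None:
                    self._model = self._loader()
                    self.load_count += 1

    def tagging(self, text):
        # Every public method checks initialization before doing any work.
        self.initialize()
        return self._model(text)


def _load_default_model():
    # Stand-in for loading the default dictionary from disk (assumption).
    return lambda text: text.split()


# Module-level singleton and shorthand, as they might appear in __init__.py.
tagger = LazyTagger(_load_default_model)
tagging = tagger.tagging
```

With this layout, importing the module costs nothing, and the dictionary is loaded once on the first call to tagging.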

taishi-i commented 6 years ago

Regarding a cache layer, should I refer to methods written on lines 91 to 168 in https://github.com/fxsjy/jieba/blob/master/jieba/__init__.py and add them to the class Tagger?

bung87 commented 6 years ago

Not sure about that. You may consider using https://docs.python.org/3/library/functools.html#functools.lru_cache to keep the code simple.