Closed: daphnei closed this issue 6 years ago
Thanks for reporting this; I've actually noticed this too for some queries. I'll see if there's anything I can do to speed up these queries, and if not, I'll add a 2-second timeout like you suggested.
Sorry it took so long. I did add something in Release 0.1.56 to control how long these queries take by limiting the search. A timeout is not as easy to implement as I thought, since SQLite has no built-in timeout for queries.
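(For reference, one general way to approximate a per-query timeout with Python's `sqlite3` is a progress handler that aborts a statement once a deadline passes. The sketch below only illustrates that technique against a plain connection; the table and query are placeholders, not Magnitude's internals.)

```python
import sqlite3
import time

def query_with_deadline(conn, sql, params=(), seconds=2.0):
    """Run a query, aborting it if it exceeds `seconds`.

    SQLite has no per-query timeout, but a progress handler that
    returns a nonzero value makes it abort the current statement.
    """
    deadline = time.monotonic() + seconds
    # Invoked roughly every 10000 SQLite virtual-machine instructions.
    conn.set_progress_handler(
        lambda: 1 if time.monotonic() > deadline else 0, 10000)
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.OperationalError:
        return None  # statement was interrupted (or otherwise failed)
    finally:
        conn.set_progress_handler(None, 0)  # remove the handler

# Placeholder usage against a throwaway in-memory database:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
print(query_with_deadline(conn, "SELECT x FROM t"))
```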
I also saw you mentioned memory-map caches weren't working for you in between runs. They likely are. Memory maps are used for kNN-search, and those are cached between runs so you don't have to build them each time (otherwise that takes > 1 min per run). What you are likely experiencing is some queries being a little slower on a new run; that is normal, however. If the first kNN-search of every run is extremely slow, then your memory-map caches aren't working. These are cached to $TMPDIR on Linux/Mac. If you are using a Docker container or a VM solution, you might be losing this temporary directory every time. If that is the case, you can volume out that temp directory to a folder on your host machine.
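(As a rough sketch of what that could look like, assuming the cache location is resolved from the standard `TMPDIR` environment variable as described above and using the usual `pymagnitude.Magnitude` entry point; the paths below are examples only.)

```python
import os

# Redirect the temp directory used for the memory-map caches to a
# persistent location *before* the library resolves it. The path is an
# example; in Docker you would bind-mount it to the host, e.g.
# `docker run -v /host/magnitude-cache:/cache ...`.
os.environ["TMPDIR"] = "/cache"
os.makedirs("/cache", exist_ok=True)

from pymagnitude import Magnitude

vectors = Magnitude("vectors.magnitude")  # example .magnitude file

# The first kNN query of a fresh run should now reuse the cached
# memory map instead of rebuilding it.
print(vectors.most_similar("cat", topn=5))
```

Setting the variable in the container's environment (or in the Dockerfile) before the process starts works just as well; the key point is that the directory has to survive container restarts.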
Also, I don't seem to be getting the following advantage described in the documentation: "Moreover, memory maps are cached between runs so even after closing a process, speed improvements are reaped."
See the following log.
I understand that some queries (especially OOV ones) should be slower than others, but 36 seconds seems excessive. This issue doesn't affect all out-of-vocabulary words. For example:
Is there anything I can do to get all queries to run within some reasonable threshold, say 2 seconds, or to get caching to work? Maybe there could be a feature where, if an OOV query is taking too long, a random vector is returned instead, as is done for the light model?
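To make the suggestion concrete, here is a rough sketch of the kind of caller-side workaround I have in mind: run the query in a worker thread and, if it doesn't finish within the threshold, fall back to a deterministic pseudo-random vector. The pymagnitude calls are assumed from the docs; the seeding scheme and file path are just illustrations.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor, TimeoutError

import numpy as np
from pymagnitude import Magnitude  # assumed entry point

vectors = Magnitude("vectors.magnitude")  # example .magnitude file
_pool = ThreadPoolExecutor(max_workers=1)

def random_vector(word, dim):
    """Deterministic pseudo-random unit vector seeded by the word."""
    seed = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % (2 ** 32)
    vec = np.random.RandomState(seed).normal(size=dim)
    return vec / np.linalg.norm(vec)

def query_with_fallback(word, timeout=2.0):
    """Return the real vector if it arrives within `timeout` seconds,
    otherwise fall back to a repeatable random vector."""
    future = _pool.submit(vectors.query, word)
    try:
        return future.result(timeout=timeout)
    except TimeoutError:
        # The slow OOV query keeps running in the background thread;
        # we just stop waiting for its result.
        return random_vector(word, vectors.dim)

vec = query_with_fallback("someveryrareoovword")
```

One caveat of this sketch is that the slow query keeps running in the background worker, so a single-worker pool can back up if many OOV words time out in a row.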