Open nicolasmaia opened 4 years ago
Actually, ignore my previous comment. We have a hard-coded limit for how long a string we look up. It's currently set at 13 characters.
Can we extend it?
Sure. We should probably do some analysis on the database to see what the longest entries are.
There will be some performance impact too since we'll do a number of more lookups than we need to, but hopefully it's acceptable? It would be nice if we could do some performance comparison though.
1 3193
2 58979
3 61481
4 75303
5 49891
6 39011
7 30063
8 21971
9 13759
10 8512
11 5391
12 3409
13 2072
14 1332
15 889
16 554
17 371
18 236
19 168
20 95
21 62
22 51
23 35
24 17
25 16
26 11
27 7
28 3
29 3
30 2
31 2
32 1
33 3
34 2
35 0
36 0
37 1
This is from the bundled dict.idx
. Didn't find anything longer than 37 full width characters.
The numbers include both kana-only and mixed kana-kanji entries.
Thanks! That's super useful.
I'm concerned that once I (finally) switch to storing the data in IndexedDB the extra lookup times for each substring are going produce a pretty significant performance impact.
Perhaps we could extend the limit to 16 for now? 19 at most?
Thanks! That's super useful.
I'm concerned that once I (finally) switch to storing the data in IndexedDB the extra lookup times for each substring are going produce a pretty significant performance impact.
Perhaps we could extend the limit to 16 for now? 19 at most?
Or maybe this should be adjustable in the options page?
Thanks! That's super useful. I'm concerned that once I (finally) switch to storing the data in IndexedDB the extra lookup times for each substring are going produce a pretty significant performance impact. Perhaps we could extend the limit to 16 for now? 19 at most?
Or maybe this should be adjustable in the options page?
This!
I've updated the hard-coded limit to 16 for now in: https://github.com/birtles/rikaichamp/commit/91846e16fc22c197f5c5188be595e8ef3304b5e8
I try really hard to avoid adding options so I'd prefer to wait until I can profile the performance difference properly and, if there's no noticeable impact, just increase the limit to 37.
For some reason, Rikai can't recognize the expression 先生と言われるほどの馬鹿でなし.
It's been in JMdict since last year, though: https://www.edrdg.org/jmdictdb/cgi-bin/entr.py?svc=jmdict&sid=&e=2070202