kuromoji Search Results

702 results for kuromoji

apache/lucene #4770

building a kuromoji dictionary is very slow and eventually f…

Note: This only affects you if you use Java 5 on 3.x, and it only affects you if you want to download/rebuild the dictionary. The analyzer itself works fine on 3.x with Java 5. With Java 6, building…

asfimport updated 11 years ago
3
apache/lucene #4804

Improved Kuromoji search mode segmentation/decompounding [LU…

Kuromoji has a segmentation mode for search that uses a heuristic to promote additional segmentation of long candidate tokens to get a decompounding effect. This heuristic has been improved. Patch i…

asfimport updated 11 years ago
5
apache/lucene #4774

optionally support naist-jdic for kuromoji [LUCENE-3700]

This is an alternative dictionary, somewhat larger (~25%). We can support it in build.xml so if a user wants to build with it, they can (the resulting jar file will be 500KB larger) --- Migrated …

asfimport updated 11 years ago
2
apache/lucene #4773

kuromoji dictionary could be more compact [LUCENE-3699]

Reading through the ipadic documentation, I realized we are storing a lot of redundant information, for example the connection costs for bigram weights are based on POS+inflection data, so it's redundant …

asfimport updated 11 years ago
13
apache/lucene #5025

validate depends on compile-tools, which does too much [LUCE…

Lucene's common-build.xml 'validate' depends on compile-tools, but some modules like icu, kuromoji, etc. have a compile-tools target (for other reasons). I think it should explicitly depend on common.…

asfimport updated 11 years ago
2
apache/lucene #4799

Add optional packing to FST building [LUCENE-3725]

The FSTs produced by Builder can be further shrunk if you are willing to spend highish transient RAM to do so... our Builder today tries hard not to use much RAM (and has options to tweak down the RAM…

asfimport updated 11 years ago
20
apache/lucene #4974

Add katakana stem filter to better deal with certain katakan…

Many Japanese katakana words end in a long sound that is sometimes optional. For example, パーティー and パーティ are both perfectly valid for "party". Similarly we have センター and センタ that are variants of "ce…

asfimport updated 11 years ago
7
apache/lucene #4874

Generify FST shortestPaths() to take a comparator [LUCENE-38…

Not sure we should do this, it costs 5-10% performance for WFSTSuggester. But maybe we can optimize something here, or maybe it's just no big deal to us. Because in general, this could be pretty power…

asfimport updated 11 years ago
13
apache/lucene #5575

Make CompressingStoredFieldsFormat the new default StoredFie…

What would you think of making CompressingStoredFieldsFormat the new default StoredFieldsFormat? Stored fields compression has many benefits : - it makes the I/O cache work for us, - file-based index…

asfimport updated 11 years ago
27
apache/lucene #5013

When Japanese (Kuromoji) tokenizer removes a punctuation tok…

I modified BaseTokenStreamTestCase to assert that the start/end offsets match for graph (posLen > 1) tokens, and this caught a bug in Kuromoji when the decompounding of a compound token has a punctuat…

asfimport updated 12 years ago
15
