infinilabs / analysis-ik

🚌 The IK Analysis plugin integrates the Lucene IK analyzer into Elasticsearch and OpenSearch and supports customized dictionaries.
Apache License 2.0

About tokenization: are single characters still split out even after adding a custom dictionary? #1035

Open crossmaya opened 9 months ago

crossmaya commented 9 months ago

'analyzer' => 'ik_max_word', 'text' => '我爱迪丽热巴'

How can I suppress the tokens '爱迪', '热', and '巴'?

Array
(
    [tokens] => Array
    (
        [0] => Array
            (
                [token] => 我爱
                [start_offset] => 0
                [end_offset] => 2
                [type] => CN_WORD
                [position] => 0
            )

        [1] => Array
            (
                [token] => 爱迪
                [start_offset] => 1
                [end_offset] => 3
                [type] => CN_WORD
                [position] => 1
            )

        [2] => Array
            (
                [token] => 迪丽热巴
                [start_offset] => 2
                [end_offset] => 6
                [type] => CN_WORD
                [position] => 2
            )

        [3] => Array
            (
                [token] => 热
                [start_offset] => 4
                [end_offset] => 5
                [type] => CN_WORD
                [position] => 3
            )

        [4] => Array
            (
                [token] => 巴
                [start_offset] => 5
                [end_offset] => 6
                [type] => CN_CHAR
                [position] => 4
            )

    )

)

Emptyrain commented 8 months ago

Using ik_smart should probably be enough.
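To make the suggestion above concrete, here is a minimal sketch for comparing the two analyzers through the `_analyze` API. It assumes a node with the IK plugin installed is reachable at `http://localhost:9200`; that URL and the helper names `build_analyze_body` and `analyze` are illustrative and not part of the plugin. As the plugin README describes it, `ik_max_word` performs the finest-grained split and emits overlapping candidates, while `ik_smart` performs the coarsest-grained split. The exact tokens returned still depend on the loaded dictionaries.

```python
import json
import urllib.request

# Assumption: a local node with the IK plugin installed; adjust as needed.
ES_URL = "http://localhost:9200"


def build_analyze_body(text, analyzer="ik_smart"):
    """Return the JSON body for a request to the _analyze API."""
    return {"analyzer": analyzer, "text": text}


def analyze(text, analyzer="ik_smart", base_url=ES_URL):
    """POST to /_analyze and return the token strings from the response."""
    data = json.dumps(build_analyze_body(text, analyzer)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/_analyze",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Each entry carries token, start_offset, end_offset, type and position.
    return [entry["token"] for entry in payload["tokens"]]
```

Calling `analyze("我爱迪丽热巴", analyzer="ik_max_word")` and then the same text with `analyzer="ik_smart"` shows side by side whether the overlapping and single-character tokens disappear for a given dictionary setup.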