The ggml models converted from "YeungNLP/bloomz-396m-zh" or "WangZeJun/bloom-396m-chat" are missing some tokens, such as those for the characters "焙" and "擀". Without the corresponding tokens, the generated result cannot be displayed correctly. However, when running the model the official Python way, there is no such problem.
Sample output; note the "�" sections:
main: prompt: '面包的烘焙制作流程'
main: number of tokens in prompt = 3
24765 -> '面包'
373 -> '的'
28967 -> '烘'
sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
面包(24765)的(373)烘(28967)�(1165)�(237)技巧(16012):(1038)
(189)1(20).(17) (210)面(1157)条(1996)要(853)煮熟(43916),(355)否则(14458)容易(7305)粘(14494)。(420)
(2813)2(21).(17) 应(23830)使用(2527)烤(15337)箱(8226)而不是(12285)微波(30656)炉(16613)加热(25228)面团(44449)。
(672)3(22).(17) 用(16647)冷水(33637)淋(15735)湿(10556)面团(44449)以防止(31473)黏(19639)在一起(10919)。
(672)4(23).(17) 在(3612)预(3119)热(4291)至(1546)摄氏(39868)175(13634)度(1423)时(1018)开始(3590)烘(28967)�(1165)�(237),(355)直到(8326)底部(26609)变得
(13044)金(1539)黄色(21313)并(1437)散(4711)发出(13801)香味(32740)即可(10134)享用(42892)</s>(2) [end of text]
main: mem per token = 4944640 bytes
main: load time = 558.57 ms
main: sample time = 516.50 ms
main: predict time = 3674.82 ms / 52.50 ms per token
main: total time = 4945.50 ms
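A likely explanation (an assumption, not confirmed against the conversion code): BLOOM uses a byte-level BPE, so a character like "焙" that lacks a single-token entry gets split into byte-piece tokens, and printing each token's bytes on its own produces U+FFFD replacement characters. The minimal sketch below illustrates the mechanism; the exact two-token byte split is hypothetical, chosen to mirror the two "�" tokens (1165 and 237) in the log above.

```python
# Sketch of why per-token printing yields "�": if "焙" has no single token
# in the converted vocab, the byte-level BPE spreads its UTF-8 bytes across
# several tokens (the split below is an assumption for illustration).
text = "焙"
raw = text.encode("utf-8")            # b'\xe7\x84\x99' (3 bytes)

# Hypothetical split across two byte-piece tokens, mirroring the two "�"
# tokens in the log above:
pieces = [raw[:2], raw[2:]]

# Decoding each piece on its own, as token-by-token printing does,
# produces U+FFFD replacement characters:
print([p.decode("utf-8", errors="replace") for p in pieces])   # ['�', '�']

# Buffering the raw bytes and decoding them together recovers the character:
print(b"".join(pieces).decode("utf-8"))                        # 焙
```

This suggests the display problem can be worked around on the ggml side by accumulating token bytes in a buffer and only flushing complete UTF-8 sequences, rather than decoding and printing each token independently.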