ggerganov / llama.cpp

LLM inference in C/C++
MIT License

llava-cli outputs gibberish #6944

Closed Any-Winter-4079 closed 4 months ago

Any-Winter-4079 commented 4 months ago

I am using an M1, on commit 928e0b70.

When I run

./llava-cli -m ./models/llava-v1.6-mistral-7b/ggml-mistral-7b-q_5_k.gguf --mmproj ./models/llava-v1.6-mistral-7b/mmproj-mistral7b-f16-q6_k.gguf -p 'Describe the image.' --image ./models/llava-v1.6-mistral-7b/S.jpg -c 4096

I get:

to to a,, new do s is and d t, in in is, r is to and,, h to a and is is t.. for s has m d,., a. is,, m he un, and b st a to to

. r. d t n is the to.' with, l the to., is is and t r and a t in, re h s d a is, is l in in as as is r in un,., to t t h a.. t on as a, st' > > the to a, t r is a the, d # is and the p r as t is, as, h has to do the in. in as c m. a is l the is to n st has on on r t, s' is new to, t a is, and/ he in g and/ is, is re, with at c is in' d,. at p c and/ is is.. n is h on a the and/,. is,- b the the on. at is is h do. and/ re, r in s, on for to to p b n,.. a as

I also tried with xtuner/llava-phi-3-mini with similar results.

./llava-cli -m ./models/llava-phi-3-mini/ggml-model-f16.gguf --mmproj ./models/llava-phi-3-mini/mmproj-model-f16.gguf -p 'Describe the image.' --image ./models/llava-v1.6-mistral-7b/S.jpg -c 4096

â }on }RESS, and thus,uminate, whereas, }}raint, but, }, }ioned, while, }}RESS, especially and hence, }ktion, }RESS,RESS,abeth,uminate,オ,RESS, }ività,RESS,ício,raint, }derr, }derr, and thus,RESS, }ución, }abeth, } }rache,RESS,abeth, }RESS, }RESS, }ionali,RESS, }}RESS, }eign, and thus,RESS, }abeth,RESS,ício,ioned,abeth,ício,derr,iry,RESS, }RESS, }RESS,otal,umm,RESS,RESS,オ, }RESS,annels, }iseconds, }} }}abeth,bose,abeth,ershell, }ktion, }RESS, }uminate,RESS, }idor, }RESS,ktion,abeth,ício,RESS,オ, }esses, perhapsRESS,abeth, }}abeth,RESS,RESS,RESS,uminate,derr,derr,bose,ività, makunivers, }}ício, }ioned,otal,ershell,abeth,RESS,RESS,RESS, } }`RESS,RESS,RESS,RESS,abeth,uminate,

Have there been any breaking changes to the llava-cli command? Can someone reproduce this issue as well?

Any-Winter-4079 commented 4 months ago

This commit does work: 8a56075b07a8b571bf95a912ffdce4c928c2b414

The image presents a simple yet striking visual. Dominating the frame is a single, vertical line, painted in a vibrant shade of green. The line is slightly curved, adding a sense of dynamism to the composition. It stands out starkly against a solid blue background, creating a strong contrast that draws the viewer's attention. The line is positioned on the right side of the image, leaving the left side untouched and open. The image is devoid of any text or additional elements, making the green line the sole focus of this composition.

This one (the next commit) makes it output gibberish: f4dea7da1841a92d2788b0535063abf2f0e28461

, whereas,RESS, and thus, }, } }RESS, but, }, }}RESS, and thus,RESS,RESS, }}RESS, and thus, }}RESS, }}RESS, }}RESS, }} }}RESS, and thus,RESS, }}RESS, }} }RESS,RESS, } }RESS, } }}RESS, }RESS,RESS,RESS, } }RESS,RESS, }RESS,RESS, }} }RESS,RESS, } }RESS,RESS,RESS,RESS,RESS,RESS, } }}RESS, }}RESS, }}RESS, }} }}RESS,RESS, }} }RESS,RESS,RESS,RESS, }RESS,RESS,RESS,RESS, }} }}RESS,RESS,RESS,RESS,RESS,RESS, }}RESS,RESS,RESS, }RESS,RESS,RESS,RESS, }RESS,RESS,RESS, }} }RESS,RESS, } }} }} }}RESS,RESS,RESS, }RESS, }RESS,RESS, }RESS,RESS,RESS, } }} }} }}RESS,RESS,RESS,RESS,RESS,RESS,RESS,

Any-Winter-4079 commented 4 months ago

Alright, so for my use case, I'll stick to commit https://github.com/ggerganov/llama.cpp/commit/8a56075b07a8b571bf95a912ffdce4c928c2b414. If anyone can't reproduce this and needs more information about my setup (Python version, packages, ...), I'll share it.

Hope this helps someone!

turian commented 4 months ago

Possibly related to the tokenization issues discussed in #7056 #7062 #7049 #7006

ggerganov commented 4 months ago

I can't reproduce. This works on latest master with M2 Ultra (both CPU and GPU):

make -j && ./llava-cli -m ./models/llava-7b-v1.6/ggml-model-f16.gguf --mmproj ./models/llava-7b-v1.6/mmproj-model-f16.gguf --image ~/Downloads/cat.png -p "What is the animal doing?" --temp 0.0 -ngl 99

@turian Don't think this is related to tokenization changes, since LLaMA 1 and 2, Mistral and Phi-3 all use what we call SPM tokenizer - i.e. no pre-tokenization is done. The tokenization changes only affect models using BPE tokenizer
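
(Editorial aside, for intuition: the distinction above is that BPE tokenizers first split the input into chunks with a model-specific pre-tokenizer regex, then apply merges inside each chunk, so changing the regex changes the tokens; SPM operates on the raw text with no such splitting step. A toy sketch, loosely modeled on a GPT-2-style pattern, not llama.cpp's actual code:)

```python
import re

# Simplified GPT-2-style pre-tokenizer regex (illustrative only; the real
# per-model patterns in llama.cpp are more elaborate).
PRETOKENIZE = re.compile(r"'s|'t|'re|'ve|'m|'ll|'d| ?\w+| ?[^\s\w]+|\s+")

def pre_tokenize(text: str) -> list[str]:
    """Split text into chunks; BPE merges would then run inside each chunk."""
    return PRETOKENIZE.findall(text)

print(pre_tokenize("Hello, world! It's 2024."))
# → ['Hello', ',', ' world', '!', ' It', "'s", ' 2024', '.']
```

If two builds disagree on this splitting step, every downstream BPE merge (and hence every token id) can differ, which is why only BPE models are affected by pre-tokenization changes.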

turian commented 4 months ago

@ggerganov Thank you for the explanation, but perhaps I need to open another bug report?

I have a Colab notebook using TinyLlama-1.1b-1 (SPM, according to llama.cpp output) showing that the llama-cpp-python tokenizer gives different output from the HF tokenizer. I am using TheBloke's GGUFs, which are quite old.

HF tokens + HF model = perplexity ~6
llama tokens + llama model = perplexity ~15
HF tokens + llama model = perplexity ~6
./perplexity.cpp = perplexity ~15

So with an old SPM GGUF model, I'm still seeing that the llama.cpp tokenization is different and worse, with no warning.
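
(The perplexity figures above are exp of the mean negative log-likelihood over the evaluated tokens. A self-contained sketch with made-up log-probabilities, not real model output:)

```python
import math

def perplexity(logprobs: list[float]) -> float:
    """Perplexity = exp(-mean(log p(token_i | context)))."""
    return math.exp(-sum(logprobs) / len(logprobs))

# Hypothetical per-token natural-log probabilities for the same text under two
# tokenizations: worse tokens -> lower probabilities -> higher perplexity.
good = [-1.7, -1.9, -1.6, -2.0]
bad  = [-2.6, -2.8, -2.7, -2.7]
print(round(perplexity(good), 1), round(perplexity(bad), 1))  # → 6.0 14.9
```

A jump from ~6 to ~15, as reported above, is far beyond run-to-run noise, which is why it points at the tokenization rather than the model weights.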

Huggingface tokens: [1, 259, 13, 353, 4755, 350, 5059, 357, 353, 29871, 13, 29871, 13, 4755, 350, 5059, 357, 338, 385, 4223, 2706, 1919, 11456, 322, 24520, 11339, 869, 940, 750, 263, 17838, 732, 29899, 29992, 380, 23693, 6297, 373, 278, 11456, 3652, 450, 6682, 297, 29871, 29906, 29900, 29900, 29900, 869, 910, 471, 5643, 491, 263, 380, 23693, 6297, 297, 278, 1708, 2439, 787, 3971, 491, 11254, 27265, 575, 1919, 607, 471, 8560, 297, 29871, 29906, 29900, 29900, 29896, 472, 278, 7021, 9245, 15521, 869, 940, 750, 263, 17838, 6297, 297, 278, 11456, 3652, 26817, 2259, 897, 287, 297, 29871, 29906, 29900, 29900, 29906, 869, 512, 29871, 29906, 29900, 29900, 29946, 350, 5059, 357, 2982, 287, 263, 6297, 408, 376, 28050, 376, 297, 278, 12720, 376, 22040, 4518, 525, 29879, 13740, 376, 310, 278, 11456, 3652, 450, 6242, 383, 3568, 2056, 540, 5810, 1127, 19963, 29701, 4485, 3767, 549, 322, 360, 20400, 10968, 29875, 869, 940, 471, 4320, 297, 278, 29871, 29906, 29900, 29900, 29945, 24520, 1391, 1953, 310, 278, 14920, 390, 333, 2330, 1708, 29389, 29891, 7509, 1919, 607, 471, 8560, 472, 278, 4942, 398, 15521, 297, 349, 368, 21026, 322, 278, 7567, 631, 678, 542, 23167, 27561, 297, 4517, 869, 940, 471, 10624, 491, 2259, 323, 2593, 1384, 322, 5810, 1127, 19963, 4111, 806, 728, 1450, 1919, 1383, 1662, 796, 19924, 1919, 10686, 13272, 1919, 383, 3417, 261, 15846, 690, 1919, 19122, 347, 624, 11960, 322, 13298, 293, 6573, 869, 29871, 13, 512, 29871, 29906, 29900, 29900, 29953, 1919, 350, 5059, 357, 5810, 1127, 19963, 806, 728, 1450, 297, 278, 1708, 21353, 466, 575, 4034, 3971, 491, 4485, 390, 3496, 29131, 869, 940, 7470, 373, 263, 29871, 29906, 29900, 29900, 29953, 12720, 310, 278, 11456, 3652, 1919, 1938, 14359, 1919, 5643, 491, 263, 6297, 297, 278, 29871, 29906, 29900, 29900, 29955, 24520, 5802, 310, 1128, 304, 10837, 344, 10624, 491, 5875, 347, 390, 473, 446, 869, 1128, 304, 10837, 344, 471, 8560, 472, 24715, 15521, 297, 278, 4517, 6780, 820, 310, 26356, 414, 29885, 389, 322, 23004, 3391, 
869, 350, 5059, 357, 5810, 1127, 297, 1023, 12298, 297, 29871, 29906, 29900, 29900, 29947, 1919, 8373, 4366, 6417, 495, 29891, 491, 2706, 28107, 3681, 951, 609, 29875, 1919, 322, 3872, 1989, 349, 3322, 10624, 491, 7137, 368, 6054, 18712, 869, 512, 2610, 29871, 29906, 29900, 29900, 29947, 1919, 350, 5059, 357, 1754, 263, 17838, 10097, 373, 263, 1023, 732, 29899, 29992, 760, 12720, 15232, 310, 278, 11456, 3652, 399, 5086, 278, 16992, 1919, 5643, 491, 385, 10097, 373, 278, 11456, 3652, 6298, 24759, 943, 297, 3979, 29871, 29906, 29900, 29900, 29947, 869, 940, 750, 263, 1162, 1038, 292, 6297, 297, 3006, 23238, 310, 278, 11456, 3652, 6960, 950, 1017, 297, 29871, 29906, 29900, 29896, 29900, 1919, 408, 376, 476, 10243, 383, 1026, 4630, 376, 869, 350, 5059, 357, 5810, 1127, 297, 278, 29871, 29906, 29900, 29896, 29896, 2706, 4702, 10278, 4314, 10624, 491, 3681, 951, 609, 29875, 869, 29871, 13, 29871, 13, 353, 353, 15825, 353, 353, 29871, 13, 29871, 13, 29871, 13, 353, 353, 353, 29871, 29906, 29900, 29900, 29900, 785]
Llama  GGUF tokens: [1, 259, 13, 353, 4755, 12476, 1896, 261, 353, 29871, 13, 29871, 13, 4755, 12476, 1896, 261, 338, 385, 4223, 2706, 1919, 11456, 322, 24520, 11339, 869, 940, 750, 263, 17838, 732, 29899, 29992, 5810, 5393, 6297, 373, 278, 11456, 3652, 450, 6682, 297, 29871, 29906, 29900, 29900, 29900, 869, 910, 471, 5643, 491, 263, 5810, 5393, 6297, 297, 278, 1708, 22167, 1983, 3971, 491, 11254, 14317, 29879, 1919, 607, 471, 8560, 297, 29871, 29906, 29900, 29900, 29896, 472, 278, 7021, 9245, 15521, 869, 940, 750, 263, 17838, 6297, 297, 278, 11456, 3652, 26817, 2259, 897, 287, 297, 29871, 29906, 29900, 29900, 29906, 869, 512, 29871, 29906, 29900, 29900, 29946, 12476, 1896, 261, 2982, 287, 263, 6297, 408, 376, 28050, 376, 297, 278, 12720, 376, 22040, 4518, 525, 29879, 13740, 376, 310, 278, 11456, 3652, 450, 6242, 14152, 29885, 2056, 540, 5810, 1127, 19963, 29701, 4485, 3767, 549, 322, 2452, 1416, 10968, 29875, 869, 940, 471, 4320, 297, 278, 29871, 29906, 29900, 29900, 29945, 24520, 5802, 29879, 310, 278, 14920, 21710, 11671, 1032, 1708, 29389, 29891, 7509, 1919, 607, 471, 8560, 472, 278, 16597, 29885, 15521, 297, 1858, 962, 2438, 322, 278, 7567, 631, 14542, 15519, 371, 27561, 297, 4517, 869, 940, 471, 10624, 491, 2259, 18439, 600, 1384, 322, 5810, 1127, 19963, 4111, 806, 728, 1450, 1919, 1383, 1662, 16753, 1362, 1919, 10686, 13272, 1919, 7347, 643, 15846, 690, 1919, 19122, 347, 7813, 880, 322, 13298, 293, 6573, 869, 29871, 13, 512, 29871, 29906, 29900, 29900, 29953, 1919, 12476, 1896, 261, 5810, 1127, 19963, 806, 728, 1450, 297, 278, 1708, 21353, 19642, 3527, 3971, 491, 4485, 28093, 264, 29131, 869, 940, 7470, 373, 263, 29871, 29906, 29900, 29900, 29953, 12720, 310, 278, 11456, 3652, 1919, 15460, 29879, 1919, 5643, 491, 263, 6297, 297, 278, 29871, 29906, 29900, 29900, 29955, 24520, 5802, 310, 1128, 304, 10837, 344, 10624, 491, 5875, 347, 15915, 29878, 446, 869, 1128, 304, 10837, 344, 471, 8560, 472, 24715, 15521, 297, 278, 4517, 6780, 820, 310, 26356, 414, 2415, 
29882, 322, 23004, 3391, 869, 12476, 1896, 261, 5810, 1127, 297, 1023, 12298, 297, 29871, 29906, 29900, 29900, 29947, 1919, 8373, 4366, 6417, 495, 29891, 491, 2706, 28107, 3681, 10255, 2034, 1919, 322, 3872, 1989, 12129, 17608, 29882, 10624, 491, 7137, 368, 6054, 18712, 869, 512, 2610, 29871, 29906, 29900, 29900, 29947, 1919, 12476, 1896, 261, 1754, 263, 17838, 10097, 373, 263, 1023, 732, 29899, 29992, 760, 12720, 15232, 310, 278, 11456, 3652, 22552, 9292, 278, 16992, 1919, 5643, 491, 385, 10097, 373, 278, 11456, 3652, 6298, 24759, 943, 297, 3979, 29871, 29906, 29900, 29900, 29947, 869, 940, 750, 263, 1162, 1038, 292, 6297, 297, 3006, 23238, 310, 278, 11456, 3652, 6960, 950, 1017, 297, 29871, 29906, 29900, 29896, 29900, 1919, 408, 376, 16540, 1489, 29876, 13859, 14246, 2276, 376, 869, 12476, 1896, 261, 5810, 1127, 297, 278, 29871, 29906, 29900, 29896, 29896, 2706, 4702, 10278, 4314, 10624, 491, 3681, 10255, 2034, 869, 29871, 13, 29871, 13, 353, 353, 15825, 353, 353, 29871, 13, 29871, 13, 29871, 13, 353, 353, 353, 29871, 29906, 29900, 29900, 29900, 785, 29871, 29906]

Is this expected and why? I can share a minimal code example here or in a new issue.

Any-Winter-4079 commented 4 months ago

> I can't reproduce. This works on latest master with M2 Ultra (both CPU and GPU):
>
> make -j && ./llava-cli -m ./models/llava-7b-v1.6/ggml-model-f16.gguf --mmproj ./models/llava-7b-v1.6/mmproj-model-f16.gguf --image ~/Downloads/cat.png -p "What is the animal doing?" --temp 0.0 -ngl 99
>
> @turian Don't think this is related to tokenization changes, since LLaMA 1 and 2, Mistral and Phi-3 all use what we call SPM tokenizer - i.e. no pre-tokenization is done. The tokenization changes only affect models using BPE tokenizer

The issue happened on the latest commit that day (https://github.com/ggerganov/llama.cpp/commit/928e0b7013c862cf10701957b3d654aa70f11bd8). Could you try that one?

, g m is d is,,' has r h and to, the a is the to un, and is c,' has do is with is l in is, in a is and has re, is at.. is, is he the b to > t to with. the on' d is at s is. with un and with at t t d un d s- l is st a., is for is to t to d. a with b has for is m d is t as the > is in in on to to as the a is to p s set new g is has is in he is d to the, pro to is is new t t for st for is., is, for. t. set b in l > st as s, the, has at t for # se and the in a is r, the t d is a,, l a has,. c re un and/ a r on in a., the a with for a, in,. for', a with,. for m a and/ d at n at > on is for is g do in on re,. re c r a un is-- is- new-- is, is s.. p

On today's latest commit, it works.

The image shows a pink X symbol. It's a simple, two-dimensional graphic with a solid color and a clear, bold outline. The X is centered in the image, with no additional context or objects present. The image is minimalistic, focusing solely on the X symbol. There's no text or other elements visible in the image.

And a previous commit (https://github.com/ggerganov/llama.cpp/commit/8a56075b07a8b571bf95a912ffdce4c928c2b414) had it working as well.

I guess the issue is solved now, although finding the source may still be useful to prevent reintroducing it in the future. I'm pretty sure it was commit https://github.com/ggerganov/llama.cpp/commit/f4dea7da1841a92d2788b0535063abf2f0e28461 that introduced it.

Update: Actually, never mind and apologies, it does work (even) on the referenced commits (https://github.com/ggerganov/llama.cpp/commit/928e0b7013c862cf10701957b3d654aa70f11bd8, https://github.com/ggerganov/llama.cpp/commit/f4dea7da1841a92d2788b0535063abf2f0e28461). For some reason (arghhh), I or some AI tool must have inadvertently removed make -j && from my command, so I was running a stale binary after git pull'ing. Apologies 🫣, closing the issue 😓, and thank you for your work!
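
(Editorial aside: the failure mode here is running a binary built before git pull. A small sketch of detecting a stale build by mtime comparison; it uses temp files standing in for the binary and a source file, so it doesn't need a llama.cpp checkout:)

```shell
# Simulate a stale build: the "binary" is older than a "source" file,
# which is exactly what happens after `git pull` without `make -j`.
tmp=$(mktemp -d)
touch "$tmp/llava-cli"        # pretend: binary built before the pull
sleep 1
touch "$tmp/llava.cpp"        # pretend: source file updated by the pull
if [ "$tmp/llava.cpp" -nt "$tmp/llava-cli" ]; then
    echo "stale build: run 'make -j' before ./llava-cli"
fi
rm -rf "$tmp"
```

Chaining `make -j && ./llava-cli ...`, as in ggerganov's command above, avoids the problem entirely.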

ggerganov commented 4 months ago

@turian You can open another issue, but I just verified that TinyLlama tokenization is correct using latest master.

Here are the steps:

diff --git a/convert-hf-to-gguf-update.py b/convert-hf-to-gguf-update.py
index a26f45a5..4fbec9f1 100755
--- a/convert-hf-to-gguf-update.py
+++ b/convert-hf-to-gguf-update.py
@@ -70,6 +70,7 @@ models = [
     {"name": "qwen2",          "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/Qwen/Qwen1.5-7B", },
     {"name": "olmo",           "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/allenai/OLMo-1.7-7B-hf", },
     {"name": "dbrx",           "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/databricks/dbrx-base", },
+    {"name": "tinyllama",      "tokt": TOKENIZER_TYPE.SPM, "repo": "https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0", },
 ]

 # make directory "models/tokenizers" if it doesn't exist
python3 convert-hf-to-gguf-update.py <hf_token>

python3 convert-hf-to-gguf.py models/tokenizers/tinyllama/ --outfile models/ggml-vocab-tinyllama.gguf --vocab-only

make -j tests && ./tests/test-tokenizer-0 ./models/ggml-vocab-tinyllama.gguf
```
llama_model_loader: loaded meta data with 23 key-value pairs and 0 tensors from ./models/ggml-vocab-tinyllama.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = llama
llama_model_loader: - kv 1: general.name str = tinyllama
llama_model_loader: - kv 2: llama.block_count u32 = 22
llama_model_loader: - kv 3: llama.context_length u32 = 2048
llama_model_loader: - kv 4: llama.embedding_length u32 = 2048
llama_model_loader: - kv 5: llama.feed_forward_length u32 = 5632
llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 4
llama_model_loader: - kv 8: llama.rope.freq_base f32 = 10000.000000
llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 10: general.file_type u32 = 1
llama_model_loader: - kv 11: llama.vocab_size u32 = 32000
llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 64
llama_model_loader: - kv 13: tokenizer.ggml.model str = llama
llama_model_loader: - kv 14: tokenizer.ggml.pre str = default
llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,32000] = ["", "", "", "<0x00>", "<...
llama_model_loader: - kv 16: tokenizer.ggml.scores arr[f32,32000] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv 17: tokenizer.ggml.token_type arr[i32,32000] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - kv 18: tokenizer.ggml.bos_token_id u32 = 1
llama_model_loader: - kv 19: tokenizer.ggml.eos_token_id u32 = 2
llama_model_loader: - kv 20: tokenizer.ggml.unknown_token_id u32 = 0
llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 2
llama_model_loader: - kv 22: tokenizer.chat_template str = {% for message in messages %}\n{% if m...
llm_load_vocab: special tokens definition check successful ( 259/32000 ).
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = llama
llm_load_print_meta: vocab type = SPM
llm_load_print_meta: n_vocab = 32000
llm_load_print_meta: n_merges = 0
llm_load_print_meta: n_ctx_train = 2048
llm_load_print_meta: n_embd = 2048
llm_load_print_meta: n_head = 32
llm_load_print_meta: n_head_kv = 4
llm_load_print_meta: n_layer = 22
llm_load_print_meta: n_rot = 64
llm_load_print_meta: n_embd_head_k = 64
llm_load_print_meta: n_embd_head_v = 64
llm_load_print_meta: n_gqa = 8
llm_load_print_meta: n_embd_k_gqa = 256
llm_load_print_meta: n_embd_v_gqa = 256
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 5632
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 0
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx = 2048
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: model type = 1B
llm_load_print_meta: model ftype = F16
llm_load_print_meta: model params = 0.00 K
llm_load_print_meta: model size = 0.00 MiB (nan BPW)
llm_load_print_meta: general.name = tinyllama
llm_load_print_meta: BOS token = 1 ''
llm_load_print_meta: EOS token = 2 ''
llm_load_print_meta: UNK token = 0 ''
llm_load_print_meta: PAD token = 2 ''
llm_load_print_meta: LF token = 13 '<0x0A>'
llama_model_load: vocab only - skipping tensors
llama_new_context_with_model: n_ctx = 512
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
src: '' res: '' tok:
src: ' ' res: ' ' tok: 29871 12
src: ' ' res: ' ' tok: 29871 12 13
src: ' ' res: ' ' tok: 29871 13
src: ' ' res: ' ' tok: 29871 13 13
src: ' ' res: ' ' tok: 29871 13 13 13
src: ' 🚀 (normal) 😶‍🌫️ (multiple emojis concatenated) ✅ 🦙🦙 3 33 333 3333 33333 333333 3333333 33333333 3.3 3..3 3...3 កាន់តែពិសេសអាច😁 ?我想在apple工作1314151天~ ------======= нещо на Български ''''''```````""""......!!!!!!?????? I've been 'told he's there, 'RE you sure? 'M not sure I'll make it, 'D you like some tea? We'Ve a'lL' res: ' 🚀 (normal) 😶‍🌫️ (multiple emojis concatenated) ✅ 🦙🦙 3 33 333 3333 33333 333333 3333333 33333333 3.3 3..3 3...3 កាន់តែពិសេសអាច😁 ?我想在apple工作1314151天~ ------======= нещо на Български ''''''```````""""......!!!!!!?????? I've been 'told he's there, 'RE you sure? 'M not sure I'll make it, 'D you like some tea? We'Ve a'lL' tok: 29871 13 29871 13 13 29871 13 13 13 29871 12 29871 12 12 29871 12 13 259 13 1678 13 268 13 418 13 243 162 157 131 313 8945 29897 29871 243 162 155 185 30722 243 162 143 174 30598 313 20787 953 3848 275 16125 630 29897 29871 31681 29871 243 162 169 156 243 162 169 156 29871 29941 29871 29941 29941 29871 29941 29941 29941 29871 29941 29941 29941 29941 29871 29941 29941 29941 29941 29941 29871 29941 29941 29941 29941 29941 29941 29871 29941 29941 29941 29941 29941 29941 29941 29871 29941 29941 29941 29941 29941 29941 29941 29941 29871 29941 29889 29941 29871 29941 636 29941 29871 29941 856 29941 29871 31849 31324 31934 228 162 142 228 161 146 228 162 133 228 161 153 228 161 186 31708 228 162 132 31708 228 161 165 31324 228 161 136 243 162 155 132 1577 30672 31522 30505 11548 31041 30732 29896 29941 29896 29946 29896 29945 29896 30408 30739 448 23648 2751 25512 1538 4851 665 1386 29713 1305 14550 4907 11120 16159 16159 16159 15945 15945 3045 636 6824 6824 6824 8773 8773 8773 306 29915 345 1063 525 29873 1025 540 29915 29879 727 29892 525 1525 366 1854 29973 525 29924 451 1854 306 29915 645 1207 372 29892 525 29928 366 763 777 23429 29973 1334 29915 29963 29872 263 29915 29880 29931
src: ' =' res: ' =' tok: 29871 13 353
src: ' ' res: ' ' tok: 259
src: ' ' res: ' ' tok: 1678
src: ' ' res: ' ' tok: 268
src: ' Hello' res: ' Hello' tok: 268 15043
src: ' Hello Hello' res: ' Hello Hello' tok: 268 15043 13 1678 15043
src: ' Hello' res: ' Hello' tok: 1678 15043
src: ' Hello' res: ' Hello' tok: 259 15043
src: ' (' res: ' (' tok: 29871 313
src: ' Hello' res: ' Hello' tok: 29871 15043
src: ' Hello World' res: ' Hello World' tok: 29871 15043 2787
src: ' Hello World!' res: ' Hello World!' tok: 29871 15043 2787 29991
src: ' Hello world' res: ' Hello world' tok: 29871 15043 3186
src: ' Hello, world!' res: ' Hello, world!' tok: 29871 15043 29892 3186 29991
src: ' this is 🦙.cpp' res: ' this is 🦙.cpp' tok: 29871 445 338 29871 243 162 169 156 29889 8223
src: '' era' res: ' ' era' tok: 525 3152
src: '3' res: ' 3' tok: 29871 29941
src: '33' res: ' 33' tok: 29871 29941 29941
src: '333' res: ' 333' tok: 29871 29941 29941 29941
src: '3333' res: ' 3333' tok: 29871 29941 29941 29941 29941
src: '33333' res: ' 33333' tok: 29871 29941 29941 29941 29941 29941
src: '333333' res: ' 333333' tok: 29871 29941 29941 29941 29941 29941 29941
src: '3333333' res: ' 3333333' tok: 29871 29941 29941 29941 29941 29941 29941 29941
src: '33333333' res: ' 33333333' tok: 29871 29941 29941 29941 29941 29941 29941 29941 29941
src: '333333333' res: ' 333333333' tok: 29871 29941 29941 29941 29941 29941 29941 29941 29941 29941
src: 'Führer' res: ' Führer' tok: 383 4000 261
src: 'Hello' res: ' Hello' tok: 15043
src: 'Hello World' res: ' Hello World' tok: 15043 2787
src: 'Hello world' res: ' Hello world' tok: 15043 3186
src: 'Hello, world!' res: ' Hello, world!' tok: 15043 29892 3186 29991
src: 'Hello, y'all! How are you 😁 ?我想在apple工作1314151天~' res: ' Hello, y'all! How are you 😁 ?我想在apple工作1314151天~' tok: 15043 29892 343 29915 497 29991 1128 526 366 29871 243 162 155 132 1577 30672 31522 30505 11548 31041 30732 29896 29941 29896 29946 29896 29945 29896 30408 30739
src: 'ied 4 ½ months' res: ' ied 4 ½ months' tok: 474 287 29871 29946 29871 30226 7378
src: 'w048 7tuijk dsdfhu' res: ' w048 7tuijk dsdfhu' tok: 281 29900 29946 29947 29871 29955 9161 13535 18031 2176 6905
src: 'нещо на Български' res: ' нещо на Български' tok: 1538 4851 665 1386 29713 1305
src: 'កាន់តែពិសេសអាចខលចេញ' res: ' កាន់តែពិសេសអាចខលចេញ' tok: 29871 31849 31324 31934 228 162 142 228 161 146 228 162 133 228 161 153 228 161 186 31708 228 162 132 31708 228 161 165 31324 228 161 136 228 161 132 228 161 158 228 161 136 228 162 132 228 161 140
src: '🚀 (normal) 😶‍🌫️ (multiple emojis concatenated) ✅ (only emoji that has its own token)' res: ' 🚀 (normal) 😶‍🌫️ (multiple emojis concatenated) ✅ (only emoji that has its own token)' tok: 29871 243 162 157 131 313 8945 29897 29871 243 162 155 185 30722 243 162 143 174 30598 313 20787 953 3848 275 16125 630 29897 29871 31681 313 6194 953 29877 2397 393 756 967 1914 5993 29897
Tests passed
```
turian commented 4 months ago

@ggerganov My issue is that an OLD, previously converted TinyLlama GGUF (a) has buggy tokenization and (b) doesn't produce any warning.

Is there any workaround for getting old GGUF files to work, rather than creating new GGUF files from HF?
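
(Editorial aside: one way to sanity-check an old file is to inspect its GGUF header and tokenizer metadata. A minimal sketch of parsing just the fixed-size header of the v2/v3 layout, i.e. magic, version, tensor count, KV count; it builds a fake header in memory rather than requiring a real model file:)

```python
import struct

def read_gguf_header(data: bytes) -> tuple[int, int, int]:
    """Parse the fixed GGUF v2/v3 header: 4-byte magic, u32 version,
    u64 tensor count, u64 metadata KV count (all little-endian)."""
    magic, version = struct.unpack_from("<4sI", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    n_tensors, n_kv = struct.unpack_from("<QQ", data, 8)
    return version, n_tensors, n_kv

# Fake 24-byte header standing in for the start of a real .gguf file
# (version 3, 0 tensors, 23 KV pairs, matching the vocab-only dump above).
fake = struct.pack("<4sIQQ", b"GGUF", 3, 0, 23)
print(read_gguf_header(fake))  # → (3, 0, 23)
```

The metadata KVs follow the header, so a fuller reader (e.g. the gguf Python package shipped with llama.cpp) can show whether keys like tokenizer.ggml.pre are present at all in an old conversion.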

teleprint-me commented 4 months ago

Maybe a version bump every time models break? Could do major.minor.revision and update the minor most of the time? Update the major for breaking changes?
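
(Editorial aside: the policy being proposed could be sketched as follows. This is a hypothetical mechanism, not something llama.cpp implements; it just records the converter version in the file and refuses to load when the major component differs:)

```python
def compatible(file_version: str, runtime_version: str) -> bool:
    """major.minor.revision: a major bump signals a breaking format change,
    a minor bump is backward-compatible."""
    file_major = int(file_version.split(".")[0])
    runtime_major = int(runtime_version.split(".")[0])
    return file_major == runtime_major

print(compatible("1.4.2", "1.9.0"))  # → True  (minor bump: still loadable)
print(compatible("1.4.2", "2.0.0"))  # → False (major bump: reconvert needed)
```

A loader following this rule would have turned turian's silent mis-tokenization into an explicit "please reconvert" error.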