jackxwu opened 3 weeks ago
The server log shows the following error. The "missing pre-tokenizer type, using: 'default'" error seems to be a known issue:
llama_model_loader: - type f32: 65 tensors
llama_model_loader: - type q4_K: 193 tensors
llama_model_loader: - type q6_K: 33 tensors
llm_load_vocab: missing pre-tokenizer type, using: 'default'
llm_load_vocab:
llm_load_vocab: ****
llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!
llm_load_vocab: CONSIDER REGENERATING THE MODEL
llm_load_vocab: ****
llm_load_vocab:
llm_load_vocab: special tokens definition check successful ( 256/128256 ).
llm_load_print_meta: format = GGUF V3 (latest)
I ran the server again by following the instructions on this page: https://github.com/OpenBMB/ollama/tree/minicpm-v2.5/examples/minicpm-v2.5. I downloaded the models from Hugging Face.
The model performs better, but it still missed the original price of $749.95. Note that the model hosted on Hugging Face was able to extract everything from the image, including the $749.95. Is this expected behavior, i.e. that inference quality is better when the model runs on a GPU?
extract product information from this image, provide product name, description and price
The product featured in the image is a "Nespresso Breville VesO850BS". It's described as a 4.2 star-rated item with 240 reviews. The current price for this product is listed as $699.95, but there's an ongoing discount of -7% off, which brings the price down to $699.95. This information suggests that it's likely a high-end coffee machine from the Nespresso brand.
extract the following information from this image, product title, current price, list price,
Product Title: Nespresso Breville VesO850BS Current Price: $699.95 List Price (before discount): $699.95
The Q4_K_M, Q8_0, and F16 models all exhibit the same degraded-performance problem.
My guess is that the problem indicated by the following error message has caused the degraded performance:
llm_load_vocab: missing pre-tokenizer type, using: 'default'
llm_load_vocab:
llm_load_vocab: ****
llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!
llm_load_vocab: CONSIDER REGENERATING THE MODEL
llm_load_vocab: ****
llm_load_vocab:
llm_load_vocab: special tokens definition check successful ( 256/128256 ).
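For what it's worth, that warning fires when the GGUF file carries no `tokenizer.ggml.pre` metadata key. A minimal sketch (pure stdlib, no `gguf` package needed) that parses a GGUF header and checks for the key — the synthetic in-memory header and the `llama-bpe` value below are illustrative assumptions, not read from this model:

```python
# Minimal GGUF metadata reader (sketch): checks whether tokenizer.ggml.pre is set.
import io
import struct

GGUF_MAGIC = b"GGUF"
# GGUF metadata value-type codes (subset of the spec)
T_UINT32, T_STRING = 4, 8

def _read_str(f):
    (n,) = struct.unpack("<Q", f.read(8))       # uint64 length, then UTF-8 bytes
    return f.read(n).decode("utf-8")

def _read_value(f, vtype):
    if vtype == T_STRING:
        return _read_str(f)
    if vtype == T_UINT32:
        return struct.unpack("<I", f.read(4))[0]
    raise NotImplementedError(f"unhandled GGUF value type {vtype}")

def gguf_metadata(f):
    """Parse a GGUF v3 header and return its metadata key/value pairs."""
    assert f.read(4) == GGUF_MAGIC, "not a GGUF file"
    version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
    meta = {}
    for _ in range(n_kv):
        key = _read_str(f)
        (vtype,) = struct.unpack("<I", f.read(4))
        meta[key] = _read_value(f, vtype)
    return meta

# Demo on a synthetic in-memory header so the sketch is self-contained;
# with a real model you would pass open("model.gguf", "rb") instead.
def _kv_str(key, val):
    kb, vb = key.encode(), val.encode()
    return (struct.pack("<Q", len(kb)) + kb +
            struct.pack("<I", T_STRING) +
            struct.pack("<Q", len(vb)) + vb)

blob = GGUF_MAGIC + struct.pack("<IQQ", 3, 0, 2)   # version 3, 0 tensors, 2 kv pairs
blob += _kv_str("tokenizer.ggml.model", "gpt2")
blob += _kv_str("tokenizer.ggml.pre", "llama-bpe")  # assumed value for a Llama-3 tokenizer

meta = gguf_metadata(io.BytesIO(blob))
print(meta.get("tokenizer.ggml.pre", "MISSING"))
```

If `tokenizer.ggml.pre` comes back missing on the real file, llama.cpp falls back to `'default'` and prints exactly the warning above.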
Is this issue related? https://huggingface.co/failspy/llama-3-70B-Instruct-abliterated-GGUF/discussions/2
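The warning's own suggestion, regenerating the model, would mean re-converting the original Hugging Face checkpoint with a llama.cpp checkout recent enough to write `tokenizer.ggml.pre` into the GGUF. A hedged sketch; the paths and output filenames are placeholders, and the exact script and binary names depend on the llama.cpp revision:

```shell
# Re-convert with an up-to-date llama.cpp so tokenizer.ggml.pre is written.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
make quantize

# /path/to/MiniCPM-Llama3-V-2_5 is a placeholder for the local HF checkpoint.
python convert-hf-to-gguf.py /path/to/MiniCPM-Llama3-V-2_5 \
    --outfile minicpm-v2.5-f16.gguf --outtype f16

# Then quantize, e.g. to Q4_K_M:
./quantize minicpm-v2.5-f16.gguf minicpm-v2.5-Q4_K_M.gguf Q4_K_M
```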
What is the issue?
I built the ollama binary on macOS from source by following the "3. Rebuild ./ollama binary file" instructions. The build works, but the model is not able to extract information correctly.
Result on macOS
Result on Hugging Face
OS
macOS
GPU
Apple
CPU
Apple
Ollama version
No response