```
Traceback (most recent call last):
  File "/Users/wayne/2-learning/Projects/gpt/DeepSeek-VL/inference2.py", line 8, in <module>
    prompt = processor.tokenizer.apply_chat_template(
AttributeError: 'LlamaTokenizerFast' object has no attribute 'tokenizer'
```
The processor doesn't have a tokenizer attribute, and it doesn't use a newline in the prompt.
Try this:
```python
import mlx.core as mx
from mlx_vlm import load, generate

model_path = "mlx-community/deepseek-vl-7b-chat-4bit"
model, processor = load(model_path)

prompt = processor.apply_chat_template(
    [{"role": "user", "content": "<image>What are these?"}],
    tokenize=False,
    add_generation_prompt=True,
)

output = generate(model, processor, "http://images.cocodataset.org/val2017/000000039769.jpg", prompt, verbose=False)
print(output)
```
Let me know how it goes
```
(deepseek) ➜ DeepSeek-VL git:(main) ✗ python inference2.py
Fetching 6 files: 100%|██████████| 6/6 [00:00<00:00, 107088.61it/s]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Traceback (most recent call last):
  File "/Users/wayne/2-learning/Projects/gpt/DeepSeek-VL/inference2.py", line 13, in <module>
    output = generate(model, processor, "http://images.cocodataset.org/val2017/000000039769.jpg", prompt, verbose=False)
  File "/Users/wayne/anaconda3/envs/deepseek/lib/python3.9/site-packages/mlx_vlm/utils.py", line 830, in generate
    prompt_tokens = mx.array(processor.tokenizer.encode(prompt))
AttributeError: 'LlamaTokenizerFast' object has no attribute 'tokenizer'
```
Thanks for your reply! It seems `generate` calls `processor.tokenizer.encode` in mlx_vlm/utils.py, but here the processor returned by `load` is the tokenizer itself.
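For reference, a minimal sketch of the failing call and a possible fallback; the `getattr` guard is only an illustration of a workaround, not something mlx_vlm ships:

```python
import mlx.core as mx
from mlx_vlm import load

model_path = "mlx-community/deepseek-vl-7b-chat-4bit"
model, processor = load(model_path)

prompt = "<image>What are these?"

# mlx_vlm/utils.py (line 830 in the traceback above) does:
#     prompt_tokens = mx.array(processor.tokenizer.encode(prompt))
# which fails when load() already returns the tokenizer itself.
# Hypothetical fallback: use the processor directly when it has
# no .tokenizer attribute.
tokenizer = getattr(processor, "tokenizer", processor)
prompt_tokens = mx.array(tokenizer.encode(prompt))
print(prompt_tokens.shape)
```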
Hey @WayneCui
The `image_processor` object was missing; this should work fine:
```python
import mlx.core as mx
from mlx_vlm.utils import load, generate, load_image_processor

model_path = "mlx-community/deepseek-vl-7b-chat-4bit"
model, processor = load(model_path)
image_processor = load_image_processor(model_path)

prompt = processor.apply_chat_template(
    [{"role": "user", "content": "<image>What are these?"}],
    tokenize=False,
    add_generation_prompt=True,
)

output = generate(
    model,
    processor,
    "http://images.cocodataset.org/val2017/000000039769.jpg",
    prompt,
    image_processor,
    verbose=False,
)
print(output)
```
The docs are coming soon, with examples for all models and how-to guides.
It works for me, thanks a lot!
Most welcome;)