jk2K opened 2 months ago
Hey folks - is there a solution for this? Does it mean I can't really use mistral.rs on a Mac for Llama 3.2 vision?
@kinchahoy are you having this issue? I cannot reproduce it on my Mac - everything works.
Hey Eric - thanks for taking a look. I get the error below when I run /examples/python/llama_vision.py with the following changes:
MODEL_ID = "EricB/Llama-3.2-11B-Vision-Instruct-UQFF"
and
which=Which.VisionPlain(
    model_id=MODEL_ID,
    arch=VisionArchitecture.VLlama,
    from_uqff="llama3.2-vision-instruct-q4k.uqff",
),
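For context, here is a minimal sketch of the full modified script, following the structure of the repo's examples/python/llama_vision.py. The image URL, prompt text, and sampling parameters below are placeholders rather than the exact values I used, and the parameter names assume the API surface of that upstream example.

# Minimal sketch, assuming the same mistralrs Python API as examples/python/llama_vision.py.
from mistralrs import Runner, Which, ChatCompletionRequest, VisionArchitecture

MODEL_ID = "EricB/Llama-3.2-11B-Vision-Instruct-UQFF"

# Load the pre-quantized UQFF weights for the Llama 3.2 vision model.
runner = Runner(
    which=Which.VisionPlain(
        model_id=MODEL_ID,
        arch=VisionArchitecture.VLlama,
        from_uqff="llama3.2-vision-instruct-q4k.uqff",
    ),
)

# The failure happens while constructing Runner above; the request below never runs,
# but it is included so the script is complete.
res = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="llama-vision",
        messages=[
            {
                "role": "user",
                "content": [
                    # Placeholder image URL; any publicly reachable image should work.
                    {"type": "image_url", "image_url": {"url": "https://example.com/some_image.jpg"}},
                    {"type": "text", "text": "Describe this image."},
                ],
            }
        ],
        max_tokens=256,
        temperature=0.1,
    )
)
print(res.choices[0].message.content)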
❯ python llama_vision_v2.py
2024-10-28T19:33:51.245838Z INFO mistralrs_core::pipeline::vision: Loading `tokenizer.json` at `EricB/Llama-3.2-11B-Vision-Instruct-UQFF`
2024-10-28T19:33:51.246011Z INFO mistralrs_core::pipeline::vision: Loading `config.json` at `EricB/Llama-3.2-11B-Vision-Instruct-UQFF`
2024-10-28T19:33:51.543602Z INFO mistralrs_core::pipeline::paths: Found model weight filenames ["residual.safetensors"]
2024-10-28T19:33:51.684169Z INFO mistralrs_core::pipeline::vision: Loading `generation_config.json` at `EricB/Llama-3.2-11B-Vision-Instruct-UQFF`
2024-10-28T19:33:51.800356Z INFO mistralrs_core::pipeline::vision: Loading `preprocessor_config.json` at `EricB/Llama-3.2-11B-Vision-Instruct-UQFF`
2024-10-28T19:33:51.912937Z INFO mistralrs_core::pipeline::vision: Loading `tokenizer_config.json` at `EricB/Llama-3.2-11B-Vision-Instruct-UQFF`
2024-10-28T19:35:54.736120Z INFO mistralrs_core::pipeline::vision: Loading model `EricB/Llama-3.2-11B-Vision-Instruct-UQFF` on metal[4294968663].
2024-10-28T19:35:54.736198Z INFO mistralrs_core::pipeline::vision: Model config: MLlamaConfig { vision_config: MLlamaVisionConfig { hidden_size: 1280, hidden_act: Gelu, num_hidden_layers: 32, num_global_layers: 8, num_attention_heads: 16, num_channels: 3, intermediate_size: 5120, vision_output_dim: 7680, image_size: 560, patch_size: 14, norm_eps: 1e-5, max_num_tiles: 4, intermediate_layers_indices: [3, 7, 15, 23, 30], supported_aspect_ratios: [(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (3, 1), (4, 1)] }, text_config: MLlamaTextConfig { rope_scaling: Some(MLlamaRopeScaling { rope_type: Llama3, factor: Some(8.0), original_max_position_embeddings: 8192, attention_factor: None, beta_fast: None, beta_slow: None, short_factor: None, long_factor: None, low_freq_factor: Some(1.0), high_freq_factor: Some(4.0) }), vocab_size: 128256, hidden_size: 4096, hidden_act: Silu, num_hidden_layers: 40, num_attention_heads: 32, num_key_value_heads: 8, intermediate_size: 14336, rope_theta: 500000.0, rms_norm_eps: 1e-5, max_position_embeddings: 131072, tie_word_embeddings: false, cross_attention_layers: [3, 8, 13, 18, 23, 28, 33, 38], use_flash_attn: false, quantization_config: None } }
2024-10-28T19:35:54.745491Z INFO mistralrs_core::utils::normal: DType selected is F16.
Traceback (most recent call last):
  File "/Users/raistlin/mistral.rs/examples/python/llama_vision_v2.py", line 7, in <module>
    runner = Runner(
             ^^^^^^^
ValueError: Metal error Error while loading function: "Function 'cast_bf16_f16' does not exist"
@kinchahoy could you please let me know what your hardware (chip, memory, etc.) is?
OS: macOS Sequoia 15.1 arm64
Host: MacBook Air (M2, 2022)
Kernel: Darwin 24.1.0
Display (Color LCD): 3420x2224 @ 60 Hz (as 1710x1112) in 14" [Built-in]
CPU: Apple M2 (8) @ 3.50 GHz
GPU: Apple M2 (10) @ 1.40 GHz [Integrated]
Memory: 9.60 GiB / 16.00 GiB (60%)
Swap: Disabled
Disk (/): 255.10 GiB / 926.35 GiB (28%) - apfs [Read-only]
Thanks again for taking a look at this Eric!
Describe the bug
error message
Latest commit or version
5fcc9d6f8c0159feb3a237d07e8b3eb191dc6474