EricLBuehler / mistral.rs

Blazingly fast LLM inference.
MIT License

`grammar` example fails #797

Open · higumachan opened this issue 1 month ago

higumachan commented 1 month ago

Describe the bug

I got the following error.

Directory: /Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs/examples

My machine environment:

```
ProductName:    macOS
ProductVersion: 14.4.1

Hardware Overview:

  Model Name: MacBook Pro
  Model Identifier: MacBookPro18,4
  Model Number: Z15H0016ZJ/A
  Chip: Apple M1 Max
  Total Number of Cores: 10 (8 performance and 2 efficiency)
  Memory: 64 GB
```

```
❯ cargo run --example grammar --release
   Compiling mistralrs-quant v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs-quant)
   Compiling mistralrs-core v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs-core)
   Compiling mistralrs-vision v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs-vision)
   Compiling mistralrs v0.3.0 (/Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/mistralrs)
    Finished release profile [optimized] target(s) in 30.75s
     Running /Users/yuta/ghq/github.com/EricLBuehler/mistral.rs/target/release/examples/grammar
2024-09-27T05:38:45.324420Z INFO hf_hub: Token file not found "/Users/yuta/.cache/huggingface/token"
2024-09-27T05:38:45.324590Z INFO mistralrs_core::utils::tokens: Could not load token at "/Users/yuta/.cache/huggingface/token", using no HF token.
2024-09-27T05:38:45.325083Z INFO mistralrs_core::pipeline::normal: Loading tokenizer.json at microsoft/Phi-3.5-mini-instruct
2024-09-27T05:38:45.325540Z INFO mistralrs_core::pipeline::normal: Loading config.json at microsoft/Phi-3.5-mini-instruct
2024-09-27T05:38:45.993416Z INFO mistralrs_core::pipeline::paths: Found model weight filenames ["model-00001-of-00002.safetensors", "model-00002-of-00002.safetensors"]
2024-09-27T05:38:46.198383Z INFO mistralrs_core::pipeline::normal: Loading generation_config.json at microsoft/Phi-3.5-mini-instruct
2024-09-27T05:38:46.933785Z INFO mistralrs_core::pipeline::normal: Loading tokenizer_config.json at microsoft/Phi-3.5-mini-instruct
2024-09-27T05:38:46.935057Z INFO mistralrs_core::pipeline::normal: Loading model microsoft/Phi-3.5-mini-instruct on cpu.
2024-09-27T05:38:46.935316Z INFO mistralrs_core::utils::log: Automatic loader type determined to be phi3
2024-09-27T05:38:46.935866Z INFO mistralrs_core::utils::normal: DType selected is F16.
2024-09-27T05:38:46.935898Z INFO mistralrs_core::pipeline::normal: Model config: Config { vocab_size: 32064, hidden_act: Silu, hidden_size: 3072, intermediate_size: 8192, num_hidden_layers: 32, num_attention_heads: 32, num_key_value_heads: 32, rms_norm_eps: 1e-5, rope_theta: 10000.0, bos_token_id: Some(1), eos_token_id: Some(32000), rope_scaling: Some(Classic { short_factor: [1.0, 1.0199999809265137, 1.0299999713897705, 1.0299999713897705, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.069999933242798, 1.0999999046325684, 1.1099998950958252, 1.1599998474121094, 1.1599998474121094, 1.1699998378753662, 1.2899998426437378, 1.339999794960022, 1.679999828338623, 1.7899998426437378, 1.8199998140335083, 1.8499997854232788, 1.879999756813049, 1.90999972820282, 1.9399996995925903, 1.9899996519088743, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0799996852874756, 2.0899996757507324, 2.189999580383301, 2.2199995517730713, 2.5899994373321533, 2.729999542236328, 2.749999523162842, 2.8399994373321533], long_factor: [1.0800000429153442, 1.1100000143051147, 1.1399999856948853, 1.340000033378601, 1.5899999141693115, 1.600000023841858, 1.6200000047683716, 2.620000123977661, 3.2300000190734863, 3.2300000190734863, 4.789999961853027, 7.400000095367432, 7.700000286102295, 9.09000015258789, 12.199999809265137, 17.670000076293945, 24.46000099182129, 28.57000160217285, 30.420001983642575, 30.840002059936523, 32.590003967285156, 32.93000411987305, 42.32000350952149, 44.96000289916992, 50.34000396728515, 50.45000457763672, 57.55000305175781, 57.93000411987305, 58.21000289916992, 60.1400032043457, 62.61000442504883, 62.62000274658203, 62.71000289916992, 63.1400032043457, 63.1400032043457, 63.77000427246094, 63.93000411987305, 63.96000289916992, 63.970001220703125, 64.02999877929688, 64.06999969482422, 64.08000183105469, 64.12000274658203, 64.41000366210938, 64.4800033569336, 64.51000213623047, 64.52999877929688, 64.83999633789063], scaling_type: Su }), max_position_embeddings: 131072, use_flash_attn: false, sliding_window: Some(262144), original_max_position_embeddings: 4096, quantization_config: None }
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 67/67 [00:04<00:00, 11.71it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 128/128 [00:10<00:00, 7.58it/s]
2024-09-27T05:39:05.613379Z INFO mistralrs_core::pipeline::isq: Applying in-situ quantization into Some(Q4K) to 129 tensors.
2024-09-27T05:39:05.613615Z INFO mistralrs_core::pipeline::isq: Applying ISQ on 10 threads.
2024-09-27T05:39:12.267294Z INFO mistralrs_core::pipeline::isq: Applied in-situ quantization into Some(Q4K) to 129 tensors out of 129 total tensors. Took 6.65s
2024-09-27T05:39:12.311255Z INFO mistralrs_core::pipeline::chat_template: bos_toks = "", eos_toks = "<|endoftext|>", "<|end|>", "<|assistant|>", unk_tok =
Error: shape mismatch in add, lhs: [32064], rhs: [32011]
```



## Latest commit or version

1eb9cae2a4ec89d7cf8a5fc8d9f57b82f2f747fa
andrewlimmer commented 1 month ago

Mac OS 18, rev = `86f37fa803c40e9ee14c43e0028ad32f841ceb07`

The error only occurs with (1) "microsoft/Phi-3-mini-128k-instruct" and (2) a constrained grammar. There is no error when using "meta-llama/Llama-3.1-8B-Instruct" with a constrained grammar.

I believe the error arises because vocab_size is set to 32064 in config.json, but https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/tokenizer.json only has 32000 tokens, and https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/added_tokens.json adds 11 more, so 32064 - (32000 + 11) = 53. I don't know where the missing 53 tokens are.
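For illustration, a minimal sketch that reproduces this count with the HF `tokenizers` crate; this is not mistral.rs code, and the local file name and the config value are assumptions (tokenizer.json downloaded from the repo above, vocab_size taken from config.json):

```rust
// Sketch: compare the tokenizer's vocabulary with the vocab_size in config.json.
use tokenizers::Tokenizer;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Assumes tokenizer.json from microsoft/Phi-3-mini-128k-instruct is in the CWD.
    let tok = Tokenizer::from_file("tokenizer.json")?;

    let cfg_vocab_size = 32064usize; // vocab_size from config.json
    let base = tok.get_vocab_size(false); // 32000 tokens in tokenizer.json
    let with_added = tok.get_vocab_size(true); // 32011 incl. added_tokens.json

    println!("base: {base}, with added tokens: {with_added}");
    // 32064 - 32011 = 53 embedding rows with no corresponding token.
    println!("unmapped embedding rows: {}", cfg_vocab_size - with_added);
    Ok(())
}
```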

[Screenshot attached: 2024-10-01 at 2:16:56 PM]
haricot commented 1 month ago

This is related: microsoft/phi-2 discussion #97 and epfl-dlab/transformers-CFG PR #83.

I suggest something like:

```rust
pub(crate) fn build_tok_trie(tokenizer: Tokenizer, cfg_vocab_size: usize) -> TokTrie {
    let bt = ByteTokenizer::from_tokenizer(tokenizer, cfg_vocab_size).unwrap();
    TokTrie::from(&bt.tokrx_info(), &bt.token_bytes())
}

impl ByteTokenizer {
    pub fn from_tokenizer(mut hft: Tokenizer, cfg_vocab_size: usize) -> Result<ByteTokenizer> {
        ...
        for tok_id in 0..vocab_size {
            ...
        }
        // If config.json declares a larger vocab than the tokenizer provides
        // (e.g. 32064 vs. 32011 for Phi-3), pad with empty byte sequences so
        // the trie length matches the model's logits dimension.
        if cfg_vocab_size > res.vocab_size {
            let vocab_size_diff = cfg_vocab_size - res.vocab_size;
            res.vocab_size = cfg_vocab_size;
            res.token_bytes
                .extend((0..vocab_size_diff).map(|_| Vec::new()));
        }
        Ok(res)
    }
}
```
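If I understand the toktrie design correctly, padding `token_bytes` with empty byte sequences means the extra IDs can never be matched during constrained decoding, while the trie's vocabulary length now matches the model's logits dimension, which should avoid the shape mismatch in the log above.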
EricLBuehler commented 1 week ago

@haricot please feel free to contribute the change. We have a draft PR for reworking the entire grammar system to use llguidance, though, which should be much cleaner.

haricot commented 1 week ago

@EricLBuehler Thanks for the info, this looks great. The draft PR with llguidance fixes this issue when the embedding size differs from the vocabulary size; the resolution seems tied to the upgrade of the toktrie_hf_tokenizers crate.