OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Apache License 2.0
12.52k stars, 880 forks

[BUG] ggml_can_repeat error when running minicpm on termux #473

Open Cosmos2023 opened 2 months ago

Cosmos2023 commented 2 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in FAQ?

Current Behavior

I hit a bug while deploying the multimodal model on my phone with termux. The phone is an iQOO 12 with 12 GB of RAM. The command was:

./minicpmv-cli -m ~/model/ggml-model-Q4_K_M.gguf --mmproj ~/model/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image ../../media/matmul.png -p "What is in the image?"

The error log is as follows:

`clip_model_load: - type f32: 285 tensors
clip_model_load: - type f16: 170 tensors
clip_model_load: CLIP using CPU backend
clip_model_load: text_encoder: 0
clip_model_load: vision_encoder: 1
clip_model_load: llava_projector: 1
clip_model_load: model size: 996.02 MB
clip_model_load: metadata size: 996.19 MB
clip_model_load: params backend buffer size = 996.02 MB (455 tensors)
key clip.vision.image_grid_pinpoints not found in file
key clip.vision.mm_patch_merge_type not found in file
key clip.vision.image_crop_resolution not found in file
clip_image_build_graph: ctx->buf_compute_meta.size(): 884880
clip_image_build_graph: load_image_size: 448 448
GGML_ASSERT: /data/data/com.termux/files/home/llama.cpp-minicpm-v2.5/ggml.c:4344: ggml_can_repeat(b, a)
Aborted`

The same problem is also reported in this issue comment: https://github.com/OpenBMB/MiniCPM-V/issues/382#issuecomment-2276981255. How can I fix this? Thanks.
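For readers unfamiliar with the assertion in the log: as far as I understand ggml, `ggml_can_repeat(b, a)` checks that tensor `b` can be tiled (broadcast) to the shape of tensor `a`, meaning every dimension of `a` is a whole multiple of the matching dimension of `b`. The Python sketch below only illustrates that rule. The function name `can_repeat` and the shapes are hypothetical and are not taken from the log, and the sketch ignores ggml's special handling of empty tensors.

```python
def can_repeat(src_shape, dst_shape):
    """Return True if a tensor of src_shape can be tiled to dst_shape.

    Illustrative rule: each destination dimension must be a whole
    multiple of the matching source dimension (all dimensions > 0).
    """
    if len(src_shape) != len(dst_shape):
        raise ValueError("shapes must have the same number of dimensions")
    return all(s > 0 and d % s == 0 for s, d in zip(src_shape, dst_shape))


# Hypothetical 4-D shapes, chosen only to show a pass and a failure.
dst = (8, 6, 1, 1)
bias_ok = (8, 1, 1, 1)   # 6 % 1 == 0, so it can be repeated along dim 1
bias_bad = (8, 4, 1, 1)  # 6 % 4 != 0, so the check fails

print(can_repeat(bias_ok, dst))   # True
print(can_repeat(bias_bad, dst))  # False
```

A failing check of this kind usually means two tensors in the compute graph disagree about a dimension, which is why the abort happens while the graph for the image encoder is being built rather than while the model file is being read.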

Expected Behavior

No response

Steps To Reproduce

No response

Environment

$ ./minicpmv-cli --version
version: 0 (unknown)
built with Android (dev, +pgo, +bolt, +lto, -mlgo, based on r487747d) clang version 17.0.2 (https://android.googlesource.com/toolchain/llvm-project d9f89f4d16663d5012e5c09495f3b30ece3d2362) for aarch64-unknown-linux-android28

Anything else?

No response

HaishengLiang commented 2 months ago

me too

Log start
clip_model_load: description: image encoder for MiniCPM-V
clip_model_load: GGUF version: 3
clip_model_load: alignment: 32
clip_model_load: n_tensors: 455
clip_model_load: n_kv: 19
clip_model_load: ftype: f16

clip_model_load: loaded meta data with 19 key-value pairs and 455 tensors from /models/MiniCPM-V-2_6-gguf/mmproj-model-f16.gguf
clip_model_load: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
clip_model_load: - kv 0: general.architecture str = clip
clip_model_load: - kv 1: clip.has_text_encoder bool = false
clip_model_load: - kv 2: clip.has_vision_encoder bool = true
clip_model_load: - kv 3: clip.has_minicpmv_projector bool = true
clip_model_load: - kv 4: general.file_type u32 = 1
clip_model_load: - kv 5: general.description str = image encoder for MiniCPM-V
clip_model_load: - kv 6: clip.projector_type str = resampler
clip_model_load: - kv 7: clip.minicpmv_version i32 = 3
clip_model_load: - kv 8: clip.vision.image_size u32 = 448
clip_model_load: - kv 9: clip.vision.patch_size u32 = 14
clip_model_load: - kv 10: clip.vision.embedding_length u32 = 1152
clip_model_load: - kv 11: clip.vision.feed_forward_length u32 = 4304
clip_model_load: - kv 12: clip.vision.projection_dim u32 = 0
clip_model_load: - kv 13: clip.vision.attention.head_count u32 = 16
clip_model_load: - kv 14: clip.vision.attention.layer_norm_epsilon f32 = 0.000001
clip_model_load: - kv 15: clip.vision.block_count u32 = 26
clip_model_load: - kv 16: clip.vision.image_mean arr[f32,3] = [0.500000, 0.500000, 0.500000]
clip_model_load: - kv 17: clip.vision.image_std arr[f32,3] = [0.500000, 0.500000, 0.500000]
clip_model_load: - kv 18: clip.use_gelu bool = true
clip_model_load: - type f32: 285 tensors
clip_model_load: - type f16: 170 tensors
clip_model_load: CLIP using CPU backend
clip_model_load: text_encoder: 0
clip_model_load: vision_encoder: 1
clip_model_load: llava_projector: 1
clip_model_load: model size: 996.02 MB
clip_model_load: metadata size: 996.19 MB
clip_model_load: params backend buffer size = 996.02 MB (455 tensors)
key clip.vision.image_grid_pinpoints not found in file
key clip.vision.mm_patch_merge_type not found in file
key clip.vision.image_crop_resolution not found in file
clip_image_build_graph: ctx->buf_compute_meta.size(): 884880
clip_image_build_graph: load_image_size: 448 448
GGML_ASSERT: ggml.c:4344: ggml_can_repeat(b, a)
Aborted
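When comparing logs like the one above against the build being used, it can help to pull the metadata key/value pairs out programmatically. The sketch below is a small, hypothetical helper (`parse_clip_metadata` is not part of llama.cpp) that assumes the `- kv N: key type = value` line layout seen in this log. It then derives the patch grid from `clip.vision.image_size` and `clip.vision.patch_size`. Whether any of these values relates to the assertion failure is not established in this thread.

```python
import re

# Matches entries such as "- kv 8: clip.vision.image_size u32 = 448".
# The value runs until the next "clip_model_load:" prefix or end of line,
# so it works on both flattened and one-entry-per-line logs.
KV_PATTERN = re.compile(
    r"- kv\s+\d+:\s+(?P<key>\S+)\s+(?P<type>\S+)\s+=\s+(?P<value>.*?)\s*"
    r"(?=clip_model_load:|$)",
    re.MULTILINE,
)


def parse_clip_metadata(log_text):
    """Return a dict mapping each metadata key to its value as a string."""
    return {m.group("key"): m.group("value") for m in KV_PATTERN.finditer(log_text)}


# Sample copied from the log above (flattened, as it was pasted).
sample = (
    "clip_model_load: - kv 7: clip.minicpmv_version i32 = 3 "
    "clip_model_load: - kv 8: clip.vision.image_size u32 = 448 "
    "clip_model_load: - kv 9: clip.vision.patch_size u32 = 14 "
    "clip_model_load: - type f32: 285 tensors"
)

meta = parse_clip_metadata(sample)
image_size = int(meta["clip.vision.image_size"])
patch_size = int(meta["clip.vision.patch_size"])
patches_per_side = image_size // patch_size

print(meta["clip.minicpmv_version"])            # 3
print(patches_per_side, patches_per_side ** 2)  # 32 1024
```

With the values from this log, a 448 x 448 input split into 14 x 14 patches gives a 32 x 32 grid, or 1024 patches per image slice.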