OpenBMB / MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.
Apache License 2.0

[Bug]: TVMError during model compilation #47

Closed by zjloong 6 months ago

zjloong commented 6 months ago

Is there an existing issue?

Describe the bug

[2024-02-05 15:25:06] INFO auto_config.py:115: Found model configuration: dist/models/MiniCPM-V/config.json
[2024-02-05 15:25:06] INFO auto_device.py:85: Not found device: cuda:0
[2024-02-05 15:25:07] INFO auto_device.py:85: Not found device: rocm:0
[2024-02-05 15:25:07] INFO auto_device.py:76: Found device: metal:0
[2024-02-05 15:25:07] INFO auto_device.py:85: Not found device: vulkan:0
[2024-02-05 15:25:08] INFO auto_device.py:85: Not found device: opencl:0
[2024-02-05 15:25:08] INFO auto_device.py:33: Using device: metal:0
[2024-02-05 15:25:08] INFO auto_weight.py:70: Finding weights in: dist/models/MiniCPM-V
[2024-02-05 15:25:08] INFO auto_weight.py:136: Not found Huggingface PyTorch
[2024-02-05 15:25:08] INFO auto_weight.py:143: Found source weight format: huggingface-safetensor. Source configuration: dist/models/MiniCPM-V/model.safetensors.index.json
[2024-02-05 15:25:08] INFO auto_weight.py:106: Using source weight configuration: dist/models/MiniCPM-V/model.safetensors.index.json. Use `--source` to override.
[2024-02-05 15:25:08] INFO auto_weight.py:110: Using source weight format: huggingface-safetensor. Use `--source-format` to override.
[2024-02-05 15:25:08] INFO auto_config.py:153: Found model type: minicpm_v. Use `--model-type` to override.
Weight conversion with arguments:
  --config          dist/models/MiniCPM-V/config.json
  --quantization    GroupQuantize(name='q4f16_1', kind='group-quant', group_size=32, quantize_dtype='int4', storage_dtype='uint32', model_dtype='float16', linear_weight_layout='NK', num_elem_per_storage=8, num_storage_per_group=4, max_int_value=7)
  --model-type      minicpm_v
  --device          metal:0
  --source          dist/models/MiniCPM-V/model.safetensors.index.json
  --source-format   huggingface-safetensor
  --output          dist/MiniCPM-V
[2024-02-05 15:25:08] INFO mistral_model.py:55: prefill_chunk_size defaults to sliding_window_size (4096)
Traceback (most recent call last):
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/bin/mlc_chat", line 33, in <module>
    sys.exit(load_entry_point('mlc-chat', 'console_scripts', 'mlc_chat')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/Develop/Project/LLM/mlc-MiniCPM/python/mlc_chat/__main__.py", line 28, in main
    cli.main(sys.argv[2:])
  File "/Users/zhujinlong/Develop/Project/LLM/mlc-MiniCPM/python/mlc_chat/cli/convert_weight.py", line 87, in main
    convert_weight(
  File "/Users/zhujinlong/Develop/Project/LLM/mlc-MiniCPM/python/mlc_chat/interface/convert_weight.py", line 156, in convert_weight
    _convert_args(args)
  File "/Users/zhujinlong/Develop/Project/LLM/mlc-MiniCPM/python/mlc_chat/interface/convert_weight.py", line 76, in _convert_args
    _, _named_params, _ = model.export_tvm(  # type: ignore[misc]
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/core.py", line 479, in export_tvm
    mod, params, ext_mods = Exporter(debug=debug).build(spec)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/exporter.py", line 136, in build
    outputs, inputs = _emit_method(self.builder, method_spec, params, effects)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/exporter.py", line 277, in _emit_method
    outputs = spec.method(*explicit_inputs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/Develop/Project/LLM/mlc-MiniCPM/python/mlc_chat/model/mistral/mistral_model.py", line 700, in image
    inputs = self.vpm(inputs)
             ^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/core.py", line 427, in __call__
    return self.forward(*args, **kwargs)  # pylint: disable=no-member
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/subroutine.py", line 87, in new_forward
    return old_forward(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/Develop/Project/LLM/mlc-MiniCPM/python/mlc_chat/model/mistral/vit_model.py", line 195, in forward
    hidden_states = self.patch_embed(inputs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/core.py", line 427, in __call__
    return self.forward(*args, **kwargs)  # pylint: disable=no-member
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/subroutine.py", line 87, in new_forward
    return old_forward(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/Develop/Project/LLM/mlc-MiniCPM/python/mlc_chat/model/mistral/vit_model.py", line 155, in forward
    embed = self.proj(inputs)
            ^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/core.py", line 427, in __call__
    return self.forward(*args, **kwargs)  # pylint: disable=no-member
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/subroutine.py", line 87, in new_forward
    return old_forward(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/Develop/Project/LLM/mlc-MiniCPM/python/mlc_chat/model/mistral/vit_model.py", line 141, in forward
    x = x + self.bias.reshape([1, self.bias.shape[0], 1, 1])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/_tensor_op.py", line 82, in reshape
    return _op().reshape(self, shape)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/frontend/nn/op.py", line 644, in reshape
    return wrap_nested(_op.reshape(x._expr, shape), name)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/relax/op/manipulate.py", line 215, in reshape
    return _ffi_api.reshape(x, shape)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tvm/_ffi/_cython/./packed_func.pxi", line 332, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 263, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./packed_func.pxi", line 252, in tvm._ffi._cy3.core.FuncCall3
  File "tvm/_ffi/_cython/./base.pxi", line 182, in tvm._ffi._cy3.core.CHECK_CALL
  File "/Users/zhujinlong/miniconda3/envs/mini-cpm-env/lib/python3.11/site-packages/tvm/_ffi/base.py", line 481, in raise_last_ffi_error
    raise py_err
tvm._ffi.base.TVMError: Traceback (most recent call last):
  File "/Users/runner/work/package/package/tvm/src/relax/op/tensor/manipulate.cc", line 675
TVMError: Check failed: (_len != nullptr) is false: Reshape only expects the input new shape to be either an Expr or an Array of PrimExprs. However, the given new shape is [[1, 1152, 1, 1]]
[15:25:08] /Users/runner/work/package/package/tvm/src/relax/ir/block_builder.cc:65: Warning: BlockBuilder destroyed with remaining blocks!

To Reproduce

I have previously compiled Mistral-7B-Instruct-v0.2 with mlc-llm and deployed it to Android successfully, so the dependency environment should not be the problem.

The MiniCPM deployment steps were as follows:

conda create -n mini-cpm-env python==3.11
conda activate mini-cpm-env

# Without this, mlc_chat convert_weight fails with: No module named 'tvm'
python3 -m pip install --pre -U -f https://mlc.ai/wheels mlc-ai-nightly

git clone --recursive https://github.com/OpenBMB/mlc-MiniCPM.git
cd mlc-MiniCPM

mkdir -p build && cd build
python3 ../cmake/gen_cmake_config.py && cd ..
cd build && cmake .. && cmake --build . --parallel $(nproc) && cd ..
cd python && pip install -e . && cd ..

# The model has already been downloaded into dist/models
mlc_chat convert_weight --model-type minicpm_v ./dist/models/MiniCPM-V/ --quantization q4f16_1 -o dist/MiniCPM-V/

The error output is shown above.

Expected behavior

No response

Screenshots

No response

Environment

- OS: macOS 12.7.1

Additional context

No response

Achazwl commented 6 months ago

This looks related to the mlc-ai-nightly version. In my local copy of https://github.com/mlc-ai/relax/blob/mlc/python/tvm/relax/frontend/nn/_tensor_op.py, the code for this interface reads as follows (I am on version 0.12.dev1953):

    def reshape(self, shape):
        return _op().reshape(self, shape)

whereas the new version has changed it to:

    def reshape(self, *shape):
        return _op().reshape(self, shape)

This change was made within the last month: https://github.com/mlc-ai/relax/commit/0b13b5c8445dfae5718af13faafef942d0a607fa
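
To make the mismatch concrete, here is a minimal plain-Python sketch (reshape_old and reshape_new are illustrative stand-ins, not the actual TVM source) of what happens to the self.bias.reshape([1, self.bias.shape[0], 1, 1]) call from vit_model.py under each signature:

    def reshape_old(tensor, shape):
        # Old signature: the caller's list arrives unchanged.
        return shape

    def reshape_new(tensor, *shape):
        # New signature: positional args are collected into a tuple.
        return shape

    bias_shape = [1, 1152, 1, 1]

    # Old API: the list itself is the shape, as intended.
    assert reshape_old(None, bias_shape) == [1, 1152, 1, 1]

    # New API, same call: the list gets wrapped once more into
    # ([1, 1152, 1, 1],), which TVM prints as [[1, 1152, 1, 1]],
    # exactly the shape reported in the TVMError above.
    assert reshape_new(None, bias_shape) == ([1, 1152, 1, 1],)

    # Under the new API the call site would need to unpack the list:
    assert reshape_new(None, *bias_shape) == (1, 1152, 1, 1)

So a likely workaround is either pinning mlc-ai-nightly to an older build (e.g. the 0.12.dev1953 I am on), or patching the call site in mistral/vit_model.py to pass the dimensions unpacked, e.g. self.bias.reshape(1, self.bias.shape[0], 1, 1).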