InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
https://lmdeploy.readthedocs.io/en/latest/
Apache License 2.0

Error When loading 'openbmb/MiniCPM-Llama3-V-2_5' #1771

Open Fahmie23 opened 3 months ago

Fahmie23 commented 3 months ago


Describe the bug

I tried loading and running inference using the sample code given in the GitHub examples directory. The model I want to load is MiniCPM-Llama3-V-2_5. Could someone help me?

Reproduction

from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('openbmb/MiniCPM-Llama3-V-2_5')

image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
response = pipe(('describe this image', image))
print(response)

Environment

sys.platform: linux
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0,1: NVIDIA A100 80GB PCIe
CUDA_HOME: /usr/local/cuda-12.2
NVCC: Cuda compilation tools, release 12.2, V12.2.140
GCC: x86_64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PyTorch: 2.2.2+cu121
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.2 (Git Hash 2dc95a2ad0841e29db8b22fbccaf3e5da7992b01)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.9.2
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.2.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, 

TorchVision: 0.17.2+cu121
LMDeploy: 0.4.2+
transformers: 4.41.2
gradio: Not Found
fastapi: 0.111.0
pydantic: 2.7.3
triton: 2.2.0

Error traceback

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[1], line 7
      4 pipe = pipeline('openbmb/MiniCPM-Llama3-V-2_5')
      6 image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
----> 7 response = pipe(('describe this image', image))
      8 print(response)

File ~/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/serve/async_engine.py:326, in AsyncEngine.__call__(self, prompts, gen_config, request_output_len, top_k, top_p, temperature, repetition_penalty, ignore_eos, do_preprocess, adapter_name, use_tqdm, **kwargs)
    318 if gen_config is None:
    319     gen_config = GenerationConfig(
    320         max_new_tokens=request_output_len,
    321         top_k=top_k,
   (...)
    324         repetition_penalty=repetition_penalty,
    325         ignore_eos=ignore_eos)
--> 326 return self.batch_infer(prompts,
    327                         gen_config=gen_config,
    328                         do_preprocess=do_preprocess,
    329                         adapter_name=adapter_name,
    330                         use_tqdm=use_tqdm,
    331                         **kwargs)

File ~/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/serve/async_engine.py:402, in AsyncEngine.batch_infer(self, prompts, gen_config, do_preprocess, adapter_name, use_tqdm, **kwargs)
    399 need_list_wrap = isinstance(prompts, str) or isinstance(
...
--> 402 assert isinstance(prompts, List), 'prompts should be a list'
    403 if gen_config is None:
    404     gen_config = GenerationConfig()

AssertionError: prompts should be a list
irexyc commented 3 months ago

How did you install LMDeploy? Version 0.4.2 doesn't support openbmb/MiniCPM-Llama3-V-2_5; you should build the latest code or copy the latest code to the installation folder.

Fahmie23 commented 3 months ago

How did you install LMDeploy? Version 0.4.2 doesn't support openbmb/MiniCPM-Llama3-V-2_5; you should build the latest code or copy the latest code to the installation folder.

I installed it using pip install lmdeploy.

Fahmie23 commented 3 months ago

Which version do I need to install? I thought version 0.4.2 was the latest version.

Fahmie23 commented 3 months ago

2024-06-13 07:19:06,357 - lmdeploy - ERROR - Engine loop failed with error: MiniCPMV.forward() missing 1 required positional argument: 'data'
Traceback (most recent call last):
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/request.py", line 17, in _raise_exception_on_finish
    task.result()
  File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/usr/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 824, in async_loop
    await __step(True)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 810, in __step
    raise e
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 802, in __step
    raise out
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 751, in _async_loop_background
    await self._async_step_background(
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 659, in _async_step_background
    output = await self._async_model_forward(inputs,
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/utils.py", line 250, in __tmp
    return (await func(*args, **kwargs))
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 576, in _async_model_forward
    return await __forward(inputs)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 554, in __forward
    return await self.model_agent.async_forward(
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 735, in async_forward
    ...
    return self._call_impl(*args, **kwargs)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: MiniCPMV.forward() missing 1 required positional argument: 'data'

/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/pygments/regexopt.py:77: RuntimeWarning: coroutine 'AsyncEngine.batch_infer.<locals>.gather' was never awaited
  '|'.join(regex_opt_inner(list(group[1]), '')
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

CancelledError                            Traceback (most recent call last)
File /usr/lib/python3.10/asyncio/tasks.py:234, in Task.__step(failed resolving arguments)
    233     else:
--> 234         result = coro.throw(exc)
    235 except StopIteration as exc:

File /usr/lib/python3.10/asyncio/queues.py:159, in Queue.get(self)
    158 try:
--> 159     await getter
    160 except:

File /usr/lib/python3.10/asyncio/futures.py:285, in Future.__await__(self)
    284 self._asyncio_future_blocking = True
--> 285 yield self  # This tells Task to wait for completion.
    286 if not self.done():

File /usr/lib/python3.10/asyncio/tasks.py:304, in Task.__wakeup(self, future)
    303 try:
--> 304     future.result()
    305 except BaseException as exc:
    306     # This may also be a cancellation.

File /usr/lib/python3.10/asyncio/futures.py:196, in Future.result(self)
    195 exc = self._make_cancelled_error()
...
--> 178     exit(1)
    179     continue
    180 except Exception as e:

NameError: name 'exit' is not defined

irexyc commented 3 months ago

0.4.2 is the latest published version, but support for openbmb/MiniCPM-Llama3-V-2_5 was added after that release.

To use openbmb/MiniCPM-Llama3-V-2_5, you could use one of the following methods.

Fahmie23 commented 3 months ago

I already git cloned the repo. After that I got a new error. Below is the error:

2024-06-13 07:19:06,357 - lmdeploy - ERROR - Engine loop failed with error: MiniCPMV.forward() missing 1 required positional argument: 'data'
Traceback (most recent call last):
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/request.py", line 17, in _raise_exception_on_finish
    task.result()
  File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/usr/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 824, in async_loop
    await __step(True)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 810, in __step
    raise e
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 802, in __step
    raise out
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 751, in _async_loop_background
    await self._async_step_background(
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 659, in _async_step_background
    output = await self._async_model_forward(inputs,
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/utils.py", line 250, in __tmp
    return (await func(*args, **kwargs))
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 576, in _async_model_forward
    return await __forward(inputs)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 554, in __forward
    return await self.model_agent.async_forward(
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 735, in async_forward
    ...
    return self._call_impl(*args, **kwargs)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: MiniCPMV.forward() missing 1 required positional argument: 'data'

/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/pygments/regexopt.py:77: RuntimeWarning: coroutine 'AsyncEngine.batch_infer.<locals>.gather' was never awaited
  '|'.join(regex_opt_inner(list(group[1]), '')
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

CancelledError                            Traceback (most recent call last)
File /usr/lib/python3.10/asyncio/tasks.py:234, in Task.__step(failed resolving arguments)
    233     else:
--> 234         result = coro.throw(exc)
    235 except StopIteration as exc:

File /usr/lib/python3.10/asyncio/queues.py:159, in Queue.get(self)
    158 try:
--> 159     await getter
    160 except:

File /usr/lib/python3.10/asyncio/futures.py:285, in Future.__await__(self)
    284 self._asyncio_future_blocking = True
--> 285 yield self  # This tells Task to wait for completion.
    286 if not self.done():

File /usr/lib/python3.10/asyncio/tasks.py:304, in Task.__wakeup(self, future)
    303 try:
--> 304     future.result()
    305 except BaseException as exc:
    306     # This may also be a cancellation.

File /usr/lib/python3.10/asyncio/futures.py:196, in Future.result(self)
    195 exc = self._make_cancelled_error()
...
--> 178     exit(1)
    179     continue
    180 except Exception as e:

NameError: name 'exit' is not defined

irexyc commented 3 months ago

Cloning the repo is not enough.

You should either build LMDeploy yourself (the first two methods) or copy the latest code to the installation folder (/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy).

Fahmie23 commented 3 months ago

I already followed your steps. After that I got a new error. Can you help me? Below is the error:

2024-06-13 07:19:06,357 - lmdeploy - ERROR - Engine loop failed with error: MiniCPMV.forward() missing 1 required positional argument: 'data'
Traceback (most recent call last):
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/request.py", line 17, in _raise_exception_on_finish
    task.result()
  File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/usr/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 824, in async_loop
    await __step(True)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 810, in __step
    raise e
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 802, in __step
    raise out
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 751, in _async_loop_background
    await self._async_step_background(
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 659, in _async_step_background
    output = await self._async_model_forward(inputs,
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/utils.py", line 250, in __tmp
    return (await func(*args, **kwargs))
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 576, in _async_model_forward
    return await __forward(inputs)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 554, in __forward
    return await self.model_agent.async_forward(
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 735, in async_forward
    ...
    return self._call_impl(*args, **kwargs)
  File "/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: MiniCPMV.forward() missing 1 required positional argument: 'data'

/home/mcmcssuserdata/fahmie/lm_deploy/lm_deploy/lib/python3.10/site-packages/pygments/regexopt.py:77: RuntimeWarning: coroutine 'AsyncEngine.batch_infer.<locals>.gather' was never awaited
  '|'.join(regex_opt_inner(list(group[1]), '')
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

CancelledError                            Traceback (most recent call last)
File /usr/lib/python3.10/asyncio/tasks.py:234, in Task.__step(failed resolving arguments)
    233     else:
--> 234         result = coro.throw(exc)
    235 except StopIteration as exc:

File /usr/lib/python3.10/asyncio/queues.py:159, in Queue.get(self)
    158 try:
--> 159     await getter
    160 except:

File /usr/lib/python3.10/asyncio/futures.py:285, in Future.__await__(self)
    284 self._asyncio_future_blocking = True
--> 285 yield self  # This tells Task to wait for completion.
    286 if not self.done():

File /usr/lib/python3.10/asyncio/tasks.py:304, in Task.__wakeup(self, future)
    303 try:
--> 304     future.result()
    305 except BaseException as exc:
    306     # This may also be a cancellation.

File /usr/lib/python3.10/asyncio/futures.py:196, in Future.result(self)
    195 exc = self._make_cancelled_error()
...
--> 178     exit(1)
    179     continue
    180 except Exception as e:

NameError: name 'exit' is not defined

QwertyJack commented 3 months ago

It seems you are using the pytorch backend. However, with the latest code from the main branch, the pipeline can select the correct backend:

>>> from lmdeploy.turbomind.supported_models import is_supported
>>> is_supported('/data/models/MiniCPM-Llama3-V-2_5')
True
irexyc commented 3 months ago

pipe = pipeline('openbmb/MiniCPM-Llama3-V-2_5')

@Fahmie23 If you replaced all the files, the above code will use the turbomind backend to init the pipeline, but according to your log there are lines like lmdeploy/pytorch/engine/engine.py, which means you are actually using the pytorch backend.
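If it helps to take the backend auto-selection out of the picture, here is a minimal sketch (not an official recipe; the assumption is that an explicit TurbomindEngineConfig overrides auto-selection, and session_len is only an illustrative value):

from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image

# Assumption: passing a TurbomindEngineConfig asks the pipeline to use the
# turbomind backend instead of auto-selecting one; session_len is an example value.
backend_config = TurbomindEngineConfig(session_len=8192)
pipe = pipeline('openbmb/MiniCPM-Llama3-V-2_5', backend_config=backend_config)

image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
print(pipe(('describe this image', image)))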

Fahmie23 commented 3 months ago

Can I know how to get the latest code?

irexyc commented 3 months ago

Git clone will clone the latest code.

But I have no way to know whether you correctly replaced the installation files. The log information is not right, as the pytorch backend should not be used.
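A quick sanity check, for example, is to print which copy of lmdeploy Python actually imports and which version it reports:

import lmdeploy

# The file path shows which site-packages copy is being imported; the version
# string should change if the latest source was really copied or built in.
print(lmdeploy.__file__)
print(lmdeploy.__version__)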

So I suggest you build LMDeploy with GitHub Actions and install the built wheel package.

AnyangAngus commented 3 months ago

@irexyc I ran MiniCPM-V-2.5 inference in lmdeploy successfully, nice work! One question: is the model forward in lmdeploy based on FasterTransformer or PyTorch?

Since there is a bug in batch inference in the Hugging Face modeling_minicpmv.py [https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/discussions/29], does lmdeploy support correct batch inference? Thank you! ^_^

irexyc commented 3 months ago

@AnyangAngus

Currently, the LLM part of MiniCPM-V-2.5 only supports the turbomind backend, which is developed based on FasterTransformer. But in the last few versions, almost all kernels except sampling have been rewritten. And we support batch inference.
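For reference, a rough sketch of batch inference with the pipeline API (the tiger image from the reproduction above is reused twice only for illustration):

from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('openbmb/MiniCPM-Llama3-V-2_5')
image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')

# A list of (prompt, image) tuples is processed as one batch.
prompts = [('describe this image', image),
           ('what animal is shown here?', image)]
responses = pipe(prompts)
for res in responses:
    print(res.text)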

AnyangAngus commented 3 months ago

@AnyangAngus

Currently, the LLM part of MiniCPM-V-2.5 only supports the turbomind backend, which is developed based on FasterTransformer. But in the last few versions, almost all kernels except sampling have been rewritten. And we support batch inference.

@irexyc Cool!

Another question: how can I pass inference parameters such as temperature and sampling parameters to lmdeploy for MiniCPM-V-2.5, as in its Hugging Face chat function?

Since the text prompt in HF is converted to msgs, does the lmdeploy MiniCPM-2.5 pipeline follow this, or should I build the msgs outside the lmdeploy pipe function?

question = 'What is in the image?'
msgs = [{'role': 'user', 'content': question}]

res = model.chat(
    image=image,
    msgs=msgs,
    tokenizer=tokenizer,
    sampling=True, # if sampling=False, beam_search will be used by default
    temperature=0.7,
    # system_prompt='' # pass system_prompt if needed
)
print(res)
irexyc commented 3 months ago

@AnyangAngus

For offline usage (pipeline API), you can refer to this doc. For serving, we provide OpenAI-compatible API usage; you can refer to this doc.
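In short, with the pipeline API the sampling parameters go into a GenerationConfig rather than into the prompt itself; a sketch with illustrative values (the pipeline applies the model's chat template, so building the HF-style msgs list by hand should not be necessary):

from lmdeploy import pipeline, GenerationConfig
from lmdeploy.vl import load_image

pipe = pipeline('openbmb/MiniCPM-Llama3-V-2_5')
image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')

# Sampling knobs (temperature, top_p, top_k, max_new_tokens, ...) are carried by GenerationConfig.
gen_config = GenerationConfig(temperature=0.7, top_p=0.8, max_new_tokens=512)
response = pipe(('What is in the image?', image), gen_config=gen_config)
print(response)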

AnyangAngus commented 3 months ago

@AnyangAngus

For offline usage (pipeline API), you can refer to this doc. For serving, we provide OpenAI-compatible API usage; you can refer to this doc.

@irexyc

As for the MiniCPM offline-usage pipeline API, I observe that backend_config and VisionConfig do not take effect. I reduce cache_max_entry_count, but the GPU memory still needs 55 GB, the same as with cache_max_entry_count=0.8:

backend_config = TurbomindEngineConfig(cache_max_entry_count=0.2)
pipe = pipeline('/models/MiniCPM-Llama3-V-2_5', backend_config=backend_config)

I set the vision batch size to try to reduce GPU memory, but I can still infer batched items and the GPU memory does not change:

vision_config = VisionConfig(max_batch_size=1)
pipe = pipeline('/models/MiniCPM-Llama3-V-2_5', vision_config=vision_config)

Is the way I passed the parameters incorrect? How can I pass these parameters to MiniCPM-2.5 correctly? Many thanks.

QwertyJack commented 3 months ago

@AnyangAngus This issue has been identified and resolved by fix #1778.
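After updating past that fix, passing both configs together should take effect; a sketch based on the snippets above, with illustrative values (note: depending on the release, VisionConfig may need to be imported from lmdeploy.messages instead of the top-level package):

from lmdeploy import pipeline, TurbomindEngineConfig, VisionConfig

# Smaller KV-cache share plus a vision batch size of 1 to reduce GPU memory.
backend_config = TurbomindEngineConfig(cache_max_entry_count=0.2)
vision_config = VisionConfig(max_batch_size=1)

pipe = pipeline('/models/MiniCPM-Llama3-V-2_5',
                backend_config=backend_config,
                vision_config=vision_config)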

WANGSSSSSSS commented 3 months ago

Which branch should I use to support MiniCPM-V-2.5?

lvhan028 commented 3 months ago

the latest main