modelscope / ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0

internVL2-8b cannot recognize images when called through the OpenAI API #1406

Closed. xierbut closed this issue 3 months ago.

xierbut commented 3 months ago

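After deploying internVL2-8b with the code from the MLLM deployment section of the swift documentation, the model cannot recognize images. The code is as follows: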

from openai import OpenAI

# Connect to the locally deployed OpenAI-compatible server.
client = OpenAI(
    api_key='EMPTY',
    base_url='http://localhost:10000/v1',
)
model_type = client.models.list().data[0].id
print(f'model_type: {model_type}')

# The image URL is embedded directly in the prompt text.
# Query in English: "What flower is in the picture, and how many are there?"
query = """Picture 1:https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/rose.jpg 图中是什么花,有几只?"""

messages = [{'role': 'user', 'content': query}]
resp = client.chat.completions.create(
    model=model_type,
    messages=messages,
    seed=42)
response = resp.choices[0].message.content
print(f'query: {query}')
print(f'response: {response}')

# Streaming
messages.append({'role': 'assistant', 'content': response})
query = '框出图中的花'  # In English: "Draw a box around the flowers in the picture"
messages.append({'role': 'user', 'content': query})
stream_resp = client.chat.completions.create(
    model=model_type,
    messages=messages,
    stream=True,
    seed=42)

print(f'query: {query}')
print('response: ', end='')
for chunk in stream_resp:
    # delta.content can be None (for example on the final chunk), so fall back to ''.
    print(chunk.choices[0].delta.content or '', end='', flush=True)
print()

The output is as follows:

[image]

However, image question answering works correctly when the model is run through direct inference.

Package Version


absl-py 2.1.0 accelerate 0.31.0 addict 2.4.0 aiobotocore 2.7.0 aiofiles 23.2.1 aiohttp 3.9.5 aioitertools 0.11.0 aioprometheus 23.12.0 aiosignal 1.3.1 aitemplate 0.0.1+das1.1.git5d8aa20.dtk2404.torch2.1.0 aliyun-python-sdk-core 2.15.1 aliyun-python-sdk-kms 2.16.3 altair 5.3.0 annotated-types 0.7.0 anyio 4.4.0 apex 1.1.0+das1.1.gitf477a3a.abi1.dtk2404.torch2.1.0 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 arxiv 2.1.0 asttokens 2.4.1 async-lru 2.0.4 async-timeout 4.0.3 attrdict 2.0.1 attrs 23.2.0 Babel 2.15.0 backports.strenum 1.3.1 bcrypt 4.1.3 beautifulsoup4 4.12.3 binpacking 1.5.2 bitsandbytes 0.42.0+das1.1.gitce85679.abi1.dtk2404.torch2.1.0 bleach 6.1.0 blinker 1.8.2 boto3 1.34.144 botocore 1.34.144 cachetools 5.3.3 certifi 2024.6.2 cffi 1.16.0 charset-normalizer 3.3.2 click 8.1.7 cloudpickle 3.0.0 cn2an 0.5.22 colorama 0.4.6 coloredlogs 15.0.1 comm 0.2.2 contourpy 1.2.1 cpm-kernels 1.0.11 crcmod 1.7 cryptography 42.0.8 cycler 0.12.1 dacite 1.8.1 datasets 2.18.0 debugpy 1.8.2 decorator 5.1.1 decord 0.6.0 deepspeed 0.12.3+gita724046.abi1.dtk2404.torch2.1.0 defusedxml 0.7.1 diffusers 0.25.0 dill 0.3.8 diskcache 5.6.3 distro 1.9.0 dnspython 2.6.1 docstring_parser 0.16 dropout-layer-norm 0.1+das1.1gitc7a8c18.abi1.dtk2404.torch2.1 ecdsa 0.19.0 editdistance 0.8.1 einops 0.5.0 email_validator 2.1.1 et-xmlfile 1.1.0 evaluate 0.4.2 exceptiongroup 1.2.1 executing 2.0.1 fairscale 0.4.13 fastapi 0.110.3 fastjsonschema 2.20.0 fastpt 1.0.0+das1.1.abi1.dtk2404 feedparser 6.0.10 ffmpy 0.3.2 filelock 3.15.1 fire 0.6.0 flash-attn 2.0.4 flatbuffers 24.3.25 fonttools 4.53.0 fqdn 1.5.1 frozenlist 1.4.1 fsspec 2023.10.0 func_timeout 4.3.5 fused-dense-lib 0.1+das1.1gitc7a8c18.abi1.dtk2404.torch2.1 future 1.0.0 fuzzywuzzy 0.18.0 gitdb 4.0.11 GitPython 3.1.43 gradio 4.26.0 gradio_client 0.15.1 griffe 0.48.0 grpcio 1.65.0 h11 0.14.0 hjson 3.1.0 httpcore 1.0.5 httptools 0.6.1 httpx 0.27.0 huggingface-hub 0.23.4 humanfriendly 10.0 hypothesis 5.35.1 idna 3.7 imageio 2.34.2 
immutabledict 4.2.0 importlib_metadata 7.1.0 importlib_resources 6.4.0 interegular 0.3.3 ipykernel 6.29.5 ipython 8.26.0 ipywidgets 8.1.3 isoduration 20.11.0 jedi 0.19.1 jieba 0.42.1 Jinja2 3.1.4 jmespath 0.10.0 joblib 1.4.2 json5 0.9.25 jsonlines 4.0.0 jsonpointer 3.0.0 jsonschema 4.22.0 jsonschema-specifications 2023.12.1 jupyter 1.0.0 jupyter_client 8.6.2 jupyter-console 6.6.3 jupyter_core 5.7.2 jupyter-events 0.10.0 jupyter-lsp 2.2.5 jupyter_server 2.14.1 jupyter_server_terminals 0.5.3 jupyterlab 4.2.3 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.2 jupyterlab_widgets 3.0.11 kiwisolver 1.4.5 lagent 0.2.2 langdetect 1.0.9 lark 1.1.9 layer-check-pt 1.2.3.git59a087a.abi1.dtk2404.torch2.1.0 lazy_loader 0.4 Levenshtein 0.25.1 lightop 0.4+das1.1git8e60f07.abi1.dtk2404.torch2.1 llmuses 0.4.1 llvmlite 0.43.0 lmdeploy 0.2.6+das1.1.git6ba90df.abi1.dtk2404.torch2.1.0 loguru 0.7.2 ltp 4.2.14 ltp-core 0.1.4 ltp-extension 0.1.13 lxml 5.2.2 Markdown 3.6 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matplotlib 3.9.0 matplotlib-inline 0.1.7 mdurl 0.1.2 mistune 3.0.2 mmcv 2.0.1+das1.1.gite58da25.abi1.dtk2404.torch2.1.0 mmengine 0.10.4 mmengine-lite 0.10.4 modelscope 1.16.0 mpmath 1.3.0 ms-opencompass 0.0.1 ms-swift 2.2.2 msgpack 1.0.8 multidict 6.0.5 multiprocess 0.70.16 nbclient 0.10.0 nbconvert 7.16.4 nbformat 5.10.4 nest-asyncio 1.6.0 networkx 3.3 ninja 1.11.1.1 nltk 3.8 notebook 7.2.1 notebook_shim 0.2.4 numba 0.60.0 numpy 1.24.3 onnxruntime 1.15.0+das1.1.git739f24d.abi1.dtk2404 openai 1.35.13 OpenCC 1.1.7 opencv-contrib-python 4.10.0.84 opencv-python 4.10.0.82 opencv-python-headless 4.10.0.84 openpyxl 3.1.5 optimum 1.21.2 orjson 3.10.5 oss2 2.18.6 outlines 0.0.46 overrides 7.7.0 packaging 24.1 pandas 1.5.3 pandocfilters 1.5.1 parso 0.8.4 passlib 1.7.4 peft 0.11.1 pexpect 4.9.0 phx-class-registry 4.1.0 pillow 10.3.0 pip 24.0 platformdirs 4.2.2 plotly 5.22.0 ply 3.11 portalocker 2.10.1 prettytable 3.10.2 proces 0.1.7 prometheus_client 0.20.0 prompt_toolkit 3.0.47 protobuf 4.25.3 
psutil 5.9.8 ptyprocess 0.7.0 pure-eval 0.2.2 py-cpuinfo 9.0.0 pyairports 2.1.1 pyarrow 16.1.0 pyarrow-hotfix 0.6 pyasn1 0.6.0 pycountry 24.6.1 pycparser 2.22 pycryptodome 3.20.0 pydantic 2.7.4 pydantic_core 2.18.4 pydantic-settings 2.3.4 pydeck 0.9.1 pydub 0.25.1 pyext 0.7 Pygments 2.18.0 Pympler 1.1 pynvml 11.5.0 pyparsing 3.1.2 pypinyin 0.51.0 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-jose 3.3.0 python-json-logger 2.0.7 python-Levenshtein 0.25.1 python-multipart 0.0.9 pytz 2024.1 PyYAML 6.0.1 pyzmq 26.0.3 qtconsole 5.5.2 QtPy 2.4.1 quantile-python 1.1 rank-bm25 0.2.2 rapidfuzz 3.9.4 ray 2.9.1 referencing 0.35.1 regex 2024.5.15 requests 2.31.0 requests-toolbelt 1.0.0 rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rich 13.7.1 rotary-emb 0.1+das1.1gitc7a8c18.abi1.dtk2404.torch2.1 rouge 1.0.1 rouge-chinese 1.0.3 rouge_score 0.1.2 rpds-py 0.18.1 rsa 4.9 ruff 0.5.1 s3fs 2023.10.0 s3transfer 0.10.2 sacrebleu 2.4.2 safetensors 0.4.3 scikit-image 0.24.0 scikit-learn 1.2.1 scipy 1.13.1 seaborn 0.13.2 semantic-version 2.10.0 Send2Trash 1.8.3 sentence-transformers 2.2.2 sentencepiece 0.2.0 setuptools 65.5.0 sgmllib3k 1.0.0 shellingham 1.5.4 shortuuid 1.0.13 shtab 1.7.1 simple-ddl-parser 1.5.1 simplejson 3.19.2 six 1.16.0 smmap 5.0.1 sniffio 1.3.1 sortedcontainers 2.4.0 soupsieve 2.5 sse-starlette 2.1.2 stack-data 0.6.3 starlette 0.37.2 streamlit 1.36.0 sympy 1.12.1 tabulate 0.9.0 tblib 3.0.0 tenacity 8.4.2 tensorboard 2.17.0 tensorboard-data-server 0.7.2 termcolor 2.4.0 terminado 0.18.1 threadpoolctl 3.5.0 tifffile 2024.7.2 tiktoken 0.7.0 timeout-decorator 0.5.0 timm 1.0.7 tinycss2 1.3.0 tokenizers 0.19.1 toml 0.10.2 tomli 2.0.1 tomlkit 0.12.0 toolz 0.12.1 torch 2.1.0+das1.1.git3ac1bdd.abi1.dtk2404 torchaudio 2.1.2+das1.1.git63d9a68.abi1.dtk2404.torch2.1.0 torchvision 0.16.0+das1.1.git7d45932.abi1.dtk2404.torch2.1 tornado 6.4.1 tqdm 4.64.1 traitlets 5.14.3 transformers 4.42.4 transformers-stream-generator 0.0.5 triton 
2.1.0+das1.1.git4bf1007a.abi1.dtk2404.torch2.1.0 trl 0.9.6 typer 0.11.1 types-python-dateutil 2.9.0.20240316 typing_extensions 4.12.2 tyro 0.8.5 tzdata 2024.1 ujson 5.10.0 uri-template 1.3.0 urllib3 2.0.7 uvicorn 0.30.1 uvloop 0.19.0 vllm 0.3.3+das1.1.gitdf6349c.abi1.dtk2404.torch2.1.0 watchdog 4.0.1 watchfiles 0.22.0 wcwidth 0.2.13 webcolors 24.6.0 webencodings 0.5.1 websocket-client 1.8.0 websockets 11.0.3 Werkzeug 3.0.3 widgetsnbextension 4.0.11 wrapt 1.16.0 xentropy-cuda-lib 0.1+das1.1gitc7a8c18.abi1.dtk2404.torch2.1 xformers 0.0.25+das1.1.git8ef8bc1.abi1.dtk2404.torch2.1.0 xinference 0.13.0 xoscar 0.3.2 xtuner 0.1.21 xxhash 3.4.1 yapf 0.40.2 yarl 1.9.4 zipp 3.19.2

Jintao-Huang commented 3 months ago

Please refer to this section: https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/MLLM%E9%83%A8%E7%BD%B2%E6%96%87%E6%A1%A3.md#minicpm-v-v2_5-chat
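For readers without the linked page at hand, here is a minimal sketch of what that section appears to suggest: send the image separately from the prompt text instead of embedding the URL in the query. It assumes the swift server reads image URLs from an `images` field in the request body, passed through the OpenAI client's `extra_body` parameter. That field name is recalled from the linked documentation, not confirmed in this thread, and the helper names `build_chat_request` and `ask_with_images` are hypothetical.

```python
from typing import Any, Dict, List


def build_chat_request(model: str, query: str, image_urls: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        'model': model,
        # The prompt carries only the question text, with no image URL inside it.
        'messages': [{'role': 'user', 'content': query}],
        'seed': 42,
        # ASSUMPTION: the swift server accepts images through an `images`
        # field in the request body; verify this against the linked docs.
        'extra_body': {'images': list(image_urls)},
    }


def ask_with_images(client: Any, model: str, query: str, image_urls: List[str]) -> str:
    """Send one non-streaming request and return the reply text."""
    request = build_chat_request(model, query, image_urls)
    resp = client.chat.completions.create(**request)
    return resp.choices[0].message.content
```

With a server running, this could be used as `ask_with_images(client, model_type, '图中是什么花,有几只?', ['https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/rose.jpg'])`, reusing the `client` and `model_type` from the original snippet.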

xierbut commented 3 months ago

Thanks for the reply. It works now.