QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

[BUG] How do I pass an image in an API call? #210

Open ydh10002023 opened 10 months ago

ydh10002023 commented 10 months ago

How do I pass an image when calling the API, the way it is entered on the web page below?

[Image]

ydh10002023 commented 10 months ago

@jklj077 can you help with this? Thanks.

jinze1994 commented 10 months ago

See the documentation: https://help.aliyun.com/zh/dashscope/developer-reference/vl-plus-quick-start

Example:

from http import HTTPStatus
import dashscope

def simple_multimodal_conversation_call():
    """Simple single round multimodal conversation call.
    """
    messages = [
        {
            "role": "user",
            "content": [
                {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
                {"text": "What is this?"}
            ]
        }
    ]
    response = dashscope.MultiModalConversation.call(model='qwen-vl-plus',
                                                     messages=messages)
    # A status_code of HTTPStatus.OK indicates success. Otherwise the
    # request failed, and the error code and message are available in
    # response.code and response.message.
    if response.status_code == HTTPStatus.OK:
        print(response)
    else:
        print(response.code)  # The error code.
        print(response.message)  # The error message.

if __name__ == '__main__':
    simple_multimodal_conversation_call()

ydh10002023 commented 10 months ago

@jinze1994 Thanks. The image path in the example is a network URL. If the image is a local file, does it need to be uploaded somewhere online first?

ShuaiBai623 commented 10 months ago

from dashscope import MultiModalConversation

def call_with_local_file():
    """Sample using a local file.
       Linux/macOS file scheme: file:///home/images/test.png
       Windows file scheme: file://D:/images/abc.png
    """
    local_file_path = 'file://The_local_absolute_file_path'
    messages = [{
        'role': 'system',
        'content': [{
            'text': 'You are a helpful assistant.'
        }]
    }, {
        'role': 'user',
        'content': [
            {
                'image': local_file_path
            },
            {
                'text': 'What is in the picture?'
            },
        ]
    }]
    response = MultiModalConversation.call(model='qwen-vl-plus', messages=messages)
    print(response)

if __name__ == '__main__':
    call_with_local_file()

ShuaiBai623 commented 10 months ago

@ydh10002023 More details at https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.0.416523edx3zH7H
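
The `file://` convention from the local-file sample's docstring can be turned into a small helper. This is a minimal sketch, not part of the dashscope SDK: `to_file_uri` is a hypothetical name, and treating a `:` in the second position as a Windows drive letter is an assumption made for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_file_uri(path: str) -> str:
    """Return a file:// string in the form shown in the sample's docstring."""
    # Assumption: a ":" as the second character marks a Windows drive (e.g. D:).
    if len(path) >= 2 and path[1] == ":":
        win_path = PureWindowsPath(path)
        if not win_path.is_absolute():
            raise ValueError(f"expected an absolute path, got {path!r}")
        # Backslashes become forward slashes: D:\images\abc.png -> D:/images/abc.png
        return "file://" + win_path.as_posix()
    posix_path = PurePosixPath(path)
    if not posix_path.is_absolute():
        raise ValueError(f"expected an absolute path, got {path!r}")
    # The leading "/" of the path supplies the third slash.
    return "file://" + posix_path.as_posix()


print(to_file_uri("/home/images/test.png"))
print(to_file_uri("D:\\images\\abc.png"))
```

The returned string would then be used as the `'image'` value in `messages`, in place of the `local_file_path` placeholder.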

timmykkk commented 4 months ago

Is passing base64 not supported?
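
The thread does not answer this question. One possible workaround, assuming only the URL and `file://` forms shown earlier are available, is to decode the base64 data to a temporary file and pass its `file://` path. The helper name `base64_to_file_uri` and the `.png` default suffix are hypothetical.

```python
import base64
import tempfile


def base64_to_file_uri(b64_data: str, suffix: str = ".png") -> str:
    """Decode base64 image data to a temp file and return a file:// string."""
    raw = base64.b64decode(b64_data)
    # delete=False keeps the file on disk after the handle is closed,
    # so the caller is responsible for removing it later.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(raw)
        path = tmp.name
    # Normalize Windows backslashes to match the file:// forms used earlier.
    return "file://" + path.replace("\\", "/")
```

The result can be placed in the `'image'` field of the local-file sample. The temporary file should be deleted once the call has returned.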

linzm1007 commented 3 months ago

The example above uses Alibaba Cloud's hosted model. If I deploy the Qwen-VL API myself, how do I upload an image?
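
The thread leaves this question open. For a self-hosted checkpoint, the Qwen-VL README describes merging images and text into a single query string with `tokenizer.from_list_format`, which wraps each image path or URL in `<img>` tags. The sketch below imitates that layout in plain Python so the format is visible. `build_query` is a hypothetical helper, and whether a particular self-deployed API server accepts such a string as message content is an assumption to check against the server script in use.

```python
from typing import Dict, List


def build_query(items: List[Dict[str, str]]) -> str:
    """Imitate the 'Picture N: <img>...</img>' query layout (assumed format)."""
    parts = []
    num_images = 0
    for item in items:
        if "image" in item:
            # Each image gets a running index and is wrapped in <img> tags.
            num_images += 1
            parts.append(f"Picture {num_images}: <img>{item['image']}</img>\n")
        elif "text" in item:
            parts.append(item["text"])
    return "".join(parts)


query = build_query([
    {"image": "/home/images/test.png"},
    {"text": "What is in the picture?"},
])
print(query)
```

When the model is loaded locally, the README's own `tokenizer.from_list_format` is the safer choice, since it stays in sync with the tokenizer's image tags.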