AmazDeng opened 4 hours ago
For online serving, we follow the OpenAI format, which accepts a URL or base64-encoded data.
For offline usage (pipeline), you could actually pass messages like:
from PIL import Image
from lmdeploy import pipeline

img = Image.open('...')
pipe = pipeline('model-path')  # placeholder model path
messages = [
    dict(role='user', content=[
        dict(type='text', text='Describe the images in detail.'),
        dict(type='image_url', image_url=dict(url=img)),
    ])
]
pipe(messages)
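By extension, a sequence of frames could in principle be passed as multiple image_url entries within one message. This is a sketch of how such a message might be assembled (the synthetic frames and prompt text are illustrative; not verified against lmdeploy's video handling):

```python
from PIL import Image

# Synthetic stand-ins for decoded video frames (illustrative only).
frames = [Image.new('RGB', (64, 64)) for _ in range(4)]

messages = [
    dict(role='user', content=[
        dict(type='text', text='Describe the video in detail.'),
        # One image_url entry per frame, each carrying a PIL.Image directly.
        *[dict(type='image_url', image_url=dict(url=f)) for f in frames],
    ])
]
```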
Understood, thank you.
@irexyc @lvhan028 @AllentDan
In my study of the lmdeploy framework, I found that for video inference, the caller first converts video frames from PIL.Image.Image objects into base64-encoded strings outside the framework; inside the framework, those strings are then decoded back into PIL.Image.Image objects. Why doesn't lmdeploy directly accept a List[PIL.Image.Image] as input for video inference? This round-trip conversion is quite time-consuming.
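The round trip described above can be sketched as follows (the function names are illustrative, not lmdeploy internals; they just show the encode-then-decode work being questioned):

```python
import base64
import io

from PIL import Image


def encode_frame(img: Image.Image) -> str:
    # PIL.Image -> base64 string: the conversion done outside the framework.
    buf = io.BytesIO()
    img.save(buf, format='PNG')
    return base64.b64encode(buf.getvalue()).decode('utf-8')


def decode_frame(data: str) -> Image.Image:
    # base64 string -> PIL.Image: the reverse conversion done inside the framework.
    return Image.open(io.BytesIO(base64.b64decode(data)))


# A synthetic frame stands in for a real decoded video frame.
frame = Image.new('RGB', (640, 480))
restored = decode_frame(encode_frame(frame))
```

Each frame is serialized to PNG bytes, base64-encoded, and then immediately decoded back, which is pure overhead when both ends already hold PIL.Image.Image objects.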