Camel-AI currently uses OpenAIMessage for message passing, which supports text and image content. However, as open-source multi-modal large language models (for example, InternLM-XComposer) continue to evolve, there is a growing need for a more versatile message structure.
This issue proposes a new message data structure that enables compatibility with advanced multi-modal models by accommodating:
Text content
Images
Video
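As a starting point for discussion, the proposed structure could look roughly like the sketch below. The names `ContentPart`, `MultiModalMessage`, `add`, `modalities`, and `to_openai_style` are hypothetical and are not part of Camel-AI. The sketch assumes each message is an ordered list of parts tagged with a modality, and that image and video parts carry a URL or file path. The conversion method emits the OpenAI-style chat layout for text and image parts; rejecting video there is an assumption of this sketch, chosen only to show where a more versatile structure diverges from the current one.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Literal

# Modalities named in this issue; the alias itself is hypothetical.
Modality = Literal["text", "image", "video"]


@dataclass(frozen=True)
class ContentPart:
    """One piece of message content (hypothetical name)."""

    modality: Modality
    # Plain text for "text" parts; a URL or file path for "image"/"video" parts.
    data: str


@dataclass
class MultiModalMessage:
    """Hypothetical container for mixed text, image, and video content."""

    role: str
    parts: List[ContentPart] = field(default_factory=list)

    def add(self, modality: Modality, data: str) -> "MultiModalMessage":
        """Append a part and return self so calls can be chained."""
        self.parts.append(ContentPart(modality, data))
        return self

    def modalities(self) -> List[str]:
        """Return the distinct modalities in first-seen order."""
        seen: List[str] = []
        for part in self.parts:
            if part.modality not in seen:
                seen.append(part.modality)
        return seen

    def to_openai_style(self) -> Dict[str, Any]:
        """Convert to an OpenAI-style chat message dict.

        Assumption of this sketch: video parts are rejected here because
        the text/image layout has no slot for them.
        """
        content: List[Dict[str, Any]] = []
        for part in self.parts:
            if part.modality == "text":
                content.append({"type": "text", "text": part.data})
            elif part.modality == "image":
                content.append(
                    {"type": "image_url", "image_url": {"url": part.data}}
                )
            else:
                raise ValueError(
                    f"cannot express {part.modality!r} in the text/image layout"
                )
        return {"role": self.role, "content": content}


if __name__ == "__main__":
    message = (
        MultiModalMessage(role="user")
        .add("text", "What happens in this clip?")
        .add("video", "clip.mp4")
    )
    print(message.modalities())
```

Keeping the parts in a single ordered list preserves interleaving (for instance, text, then an image, then more text), which a model-specific backend could then translate into whatever input format it expects.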
Required prerequisites
Motivation
Solution
No response
Alternatives
No response
Additional context
No response