[Feature Request] Enhance the Chat Message's compatibility with multimodal content (text, images, video)

Required prerequisites

[X] I have searched the Issue Tracker and Discussions and confirmed that this hasn't already been reported. (+1 or comment there if it has.)
[X] Consider asking first in a Discussion.

Motivation

CAMEL-AI currently uses OpenAIMessage for message passing, which supports text and image content. However, as open-source multimodal large language models (e.g., InternLM-XComposer) continue to evolve, there is a growing need for a more versatile message structure.

This issue proposes a new message data structure that enables compatibility with advanced multimodal models by accommodating:

Text content
Images
Video
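
As a starting point for discussion, one possible shape for such a structure is a message that holds an ordered list of typed content parts. This is only a minimal sketch: the names `TextPart`, `ImagePart`, `VideoPart`, and `MultimodalMessage` are hypothetical and are not part of the existing CAMEL or OpenAI APIs.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Union


# All classes below are hypothetical illustrations, not existing CAMEL types.
@dataclass
class TextPart:
    text: str
    type: Literal["text"] = "text"


@dataclass
class ImagePart:
    url: str  # assumed to be a URL or local path to the image
    type: Literal["image"] = "image"


@dataclass
class VideoPart:
    url: str  # assumed to be a URL or local path to the video
    type: Literal["video"] = "video"


ContentPart = Union[TextPart, ImagePart, VideoPart]


@dataclass
class MultimodalMessage:
    role: str
    parts: List[ContentPart] = field(default_factory=list)

    def modalities(self) -> List[str]:
        """Return the distinct content types, in order of first appearance."""
        seen: List[str] = []
        for part in self.parts:
            if part.type not in seen:
                seen.append(part.type)
        return seen

    def text(self) -> str:
        """Join the text parts so text-only backends can still consume the message."""
        return "\n".join(p.text for p in self.parts if isinstance(p, TextPart))


msg = MultimodalMessage(
    role="user",
    parts=[TextPart("Describe this clip."), VideoPart("clip.mp4")],
)
```

Keeping the parts in a single ordered list preserves the interleaving of text and media, and a backend that does not support a given modality could inspect `modalities()` before deciding how to convert or reject the message.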

Solution

No response

Alternatives

No response

Additional context

No response

camel-ai / camel

[Feature Request] Enhance the Chat Message's compatibility with multimodal content (text, images, video) #806
