Closed 04cfb1ed closed 6 days ago
Vertex AI multimodal embeddings accept image, text, and video input.
They can be used through the REST endpoint or the Vertex AI SDK for Python.
https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings#aiplatform_sdk_text_image_embedding-drest
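For the REST path, here is a rough sketch of how the request might be assembled. The endpoint shape, the `multimodalembedding@001` model version, and the `bytesBase64Encoded` / `dimension` field names are my reading of the linked docs and should be checked against them; `build_embedding_request` is a hypothetical helper, not part of any SDK. It only builds the URL and JSON body and does not send anything.

```python
import base64


def build_embedding_request(project_id, image_bytes, text,
                            dimension=1408, location="us-central1"):
    """Return (url, body) for a multimodal embedding predict call.

    Assumption: endpoint path and payload field names follow the linked
    Vertex AI docs; verify them there before relying on this.
    """
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/"
        "multimodalembedding@001:predict"
    )
    body = {
        "instances": [
            {
                # Contextual text embedded alongside the image.
                "text": text,
                # Raw image bytes must be base64-encoded for JSON transport.
                "image": {
                    "bytesBase64Encoded": base64.b64encode(image_bytes).decode("ascii")
                },
            }
        ],
        # Requested embedding size.
        "parameters": {"dimension": dimension},
    }
    return url, body
```

The resulting body would be POSTed as JSON with an OAuth bearer token (for example one obtained via `gcloud auth print-access-token`).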
Sample code from the reference:
```python
import vertexai
from vertexai.vision_models import Image, MultiModalEmbeddingModel

# TODO(developer): Update values for project_id, image_path & contextual_text
vertexai.init(project=project_id, location="us-central1")

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding")
image = Image.load_from_file(image_path)

embeddings = model.get_embeddings(
    image=image,
    contextual_text=contextual_text,
    dimension=dimension,
)

print(f"Image Embedding: {embeddings.image_embedding}")
print(f"Text Embedding: {embeddings.text_embedding}")
```
Multimodal embeddings enable multimodal RAG pipelines that combine video, audio, or images.
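To make the RAG angle concrete, here is a minimal sketch of the retrieval step, assuming image, video, and text embeddings land in one shared vector space so a text query can be compared directly against indexed media. The function names and the toy two-dimensional vectors are made up for illustration; real embeddings would come from the model and be much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # avoid dividing by zero for empty/zero vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_embedding, indexed_items, top_k=3):
    """Return the top_k (item_id, score) pairs, most similar first.

    indexed_items maps an item id (image, video segment, text chunk)
    to its precomputed embedding.
    """
    scored = [
        (item_id, cosine_similarity(query_embedding, embedding))
        for item_id, embedding in indexed_items.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings (hypothetical data).
index = {"cat.png": [0.9, 0.1], "clip.mp4": [0.0, 1.0]}
best_matches = rank_by_similarity([1.0, 0.0], index, top_k=1)
```

The retrieved items would then be passed to the generation step as context, the same way text chunks are in a text-only RAG setup.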
That's interesting. Happy to add support for this.
@04cfb1ed can we set up a 1:1 support channel? I noticed you've had a couple of issues and want to prioritize correctly for your use case.
LinkedIn / Discord (just 👋 wave on #general and I'll set up a channel)