feat: Multi-modal completion thanks to Gemini Generative AI

You can test it with the following flow, which shows how to make multi-modal completions (text, image, and video) using Gemini Generative AI.

id: gemini-completion
namespace: dev

inputs:
  - name: image
    type: FILE
  - name: video
    type: FILE

tasks:
  - id: ask-for-jokes
    type: io.kestra.plugin.gcp.vertexai.MultimodalCompletion
    region: <regionId>
    projectId: <projectId>
    contents:
      - content: Can you tell me a good joke?

  - id: describe-image
    type: io.kestra.plugin.gcp.vertexai.MultimodalCompletion
    region: <regionId>
    projectId: <projectId>
    contents:
      - content: Can you describe this image?
      - mimeType: image/jpeg
        content: "{{ inputs.image }}"

  - id: describe-video
    type: io.kestra.plugin.gcp.vertexai.MultimodalCompletion
    region: <regionId>
    projectId: <projectId>
    contents:
      - content: Can you describe this video?
      - mimeType: video/mp4
        content: "{{ inputs.video }}"

To trigger a blocked response, ask the model to describe an image that the safety filters treat as harmful (such as an identity card), or ask it for a sarcastic joke.
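
As a rough sketch of the sarcastic-joke case, a flow like the one below could be used to provoke a blocked response. It reuses only the task type and properties from the flow above; the flow id `gemini-blocked-response` and the task id `ask-for-sarcastic-joke` are made-up names for illustration, and whether the model actually blocks this prompt depends on the safety settings in effect.

```yaml
# Hypothetical flow for exercising a blocked response.
# The ids below are illustrative names, not part of this PR.
id: gemini-blocked-response
namespace: dev

tasks:
  - id: ask-for-sarcastic-joke
    type: io.kestra.plugin.gcp.vertexai.MultimodalCompletion
    region: <regionId>        # replace with your region
    projectId: <projectId>    # replace with your GCP project id
    contents:
      # Text-only prompt; no mimeType is set, as in the ask-for-jokes task above.
      - content: Can you tell me a sarcastic joke?
```

The identity-card case would follow the `describe-image` task above, with the card image supplied as the `image` input.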

kestra-io / plugin-gcp

feat: Multi-modal completion thanks to Gemini Generative AI #303