langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License
11.79k stars 1.97k forks source link

No `inputImage` property on Bedrock embedding #5971

Open bmichotte opened 4 days ago

bmichotte commented 4 days ago

Checked other resources

Example Code

Based on https://js.langchain.com/v0.2/docs/integrations/text_embedding/bedrock

import { BedrockEmbeddings } from "@langchain/community/embeddings/bedrock";

const embeddings = new BedrockEmbeddings({
  region: process.env.BEDROCK_AWS_REGION!,
  credentials: {
    accessKeyId: process.env.BEDROCK_AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.BEDROCK_AWS_SECRET_ACCESS_KEY!,
  },
  model: "amazon.titan-embed-text-v1", // Default value
});

const res = await embeddings.embedQuery(
  "What would be a good company name a company that makes colorful socks?"
);
console.log({ res });

Error Message and Stack Trace (if applicable)

No response

Description

Bedrock gives the possibility to get embeddings for images using the amazon.titan-embed-image-v1 model

{
  "modelId": "amazon.titan-embed-image-v1",
  "contentType": "application/json",
  "accept": "application/json",
  "body": {
    "inputText": "this is where you place your input text",
    "inputImage": "<base64_image_string>"
  }
}

Atm, there's no way to give that inputImage

On https://github.com/langchain-ai/langchainjs/blob/main/libs/langchain-community/src/embeddings/bedrock.ts we could add a new parameter image to embedQuery such as

-  embedQuery(document: string): Promise<number[]> {
+  embedQuery(document: string, image: string | undefined = undefined): Promise<number[]> {
    return this.caller.callWithOptions(
      {},
      this._embedText.bind(this),
      document,
+     image
    );
  }

and modify the _embedText for

-protected async _embedText(text: string): Promise<number[]> {
+protected async _embedText(text: string, image: string | undefined = undefined): Promise<number[]> {
...
          new InvokeModelCommand({
            modelId: this.model,
            body: JSON.stringify({
              inputText: cleanedText,
+             inputImage: image,
            }),

System Info

langchain 0.2.8 platform mac Node version : 20.12.2 pnpm version : 9.4.0

dosubot[bot] commented 4 days ago

Hey there, @bmichotte! I'm here to help you with any bugs, questions, or contributions you might have. Let's tackle this together!

To address the issue of embedding images using the amazon.titan-embed-image-v1 model in the BedrockEmbeddings class, you can modify the class to include support for image inputs. Here is how you can do it:

  1. Add a new method _embedImage to handle image embedding.
  2. Modify the constructor to set the default model to amazon.titan-embed-image-v1 if not provided.
  3. Add a new method embedImage to expose the image embedding functionality.

Here is the modified code:

import {
  BedrockRuntimeClient,
  InvokeModelCommand,
} from "@aws-sdk/client-bedrock-runtime";
import { Embeddings, EmbeddingsParams } from "@langchain/core/embeddings";
import type { CredentialType } from "../utils/bedrock/index.js";

/**
 * Interface that extends EmbeddingsParams and defines additional
 * parameters specific to the BedrockEmbeddings class.
 */
export interface BedrockEmbeddingsParams extends EmbeddingsParams {
  model?: string;
  client?: BedrockRuntimeClient;
  region?: string;
  credentials?: CredentialType;
}

/**
 * Class that extends the Embeddings class and provides methods for
 * generating embeddings using the Bedrock API.
 */
export class BedrockEmbeddings
  extends Embeddings
  implements BedrockEmbeddingsParams
{
  model: string;
  client: BedrockRuntimeClient;
  batchSize = 512;

  constructor(fields?: BedrockEmbeddingsParams) {
    super(fields ?? {});
    this.model = fields?.model ?? "amazon.titan-embed-text-v1";
    this.client =
      fields?.client ??
      new BedrockRuntimeClient({
        region: fields?.region,
        credentials: fields?.credentials,
      });
  }

  protected async _embedText(text: string): Promise<number[]> {
    return this.caller.call(async () => {
      try {
        const cleanedText = text.replace(/\n/g, " ");
        const res = await this.client.send(
          new InvokeModelCommand({
            modelId: this.model,
            body: JSON.stringify({
              inputText: cleanedText,
            }),
            contentType: "application/json",
            accept: "application/json",
          })
        );
        const body = new TextDecoder().decode(res.body);
        return JSON.parse(body).embedding;
      } catch (e) {
        console.error({ error: e });
        if (e instanceof Error) {
          throw new Error(
            `An error occurred while embedding documents with Bedrock: ${e.message}`
          );
        }
        throw new Error(
          "An error occurred while embedding documents with Bedrock"
        );
      }
    });
  }

  protected async _embedImage(imageBase64: string): Promise<number[]> {
    return this.caller.call(async () => {
      try {
        const res = await this.client.send(
          new InvokeModelCommand({
            modelId: this.model,
            body: JSON.stringify({
              inputImage: imageBase64,
            }),
            contentType: "application/json",
            accept: "application/json",
          })
        );
        const body = new TextDecoder().decode(res.body);
        return JSON.parse(body).embedding;
      } catch (e) {
        console.error({ error: e });
        if (e instanceof Error) {
          throw new Error(
            `An error occurred while embedding images with Bedrock: ${e.message}`
          );
        }
        throw new Error(
          "An error occurred while embedding images with Bedrock"
        );
      }
    });
  }

  embedQuery(document: string): Promise<number[]> {
    return this.caller.callWithOptions(
      {},
      this._embedText.bind(this),
      document
    );
  }

  embedImage(imageBase64: string): Promise<number[]> {
    return this.caller.callWithOptions(
      {},
      this._embedImage.bind(this),
      imageBase64
    );
  }

  async embedDocuments(documents: string[]): Promise<number[][]> {
    return Promise.all(documents.map((document) => this._embedText(document)));
  }
}

In this code, the _embedImage method is added to handle image embedding using the amazon.titan-embed-image-v1 model. The embedImage method is exposed to allow users to embed images by providing a base64 encoded image string [1].

To continue talking to Dosu, mention @dosu.