langchain-ai / langchainjs

ElasticVectorSearch - search similarity without possibility to pass size to query #2229

Closed andersonnabas closed 1 year ago

andersonnabas commented 1 year ago

Please add the ability to pass the size parameter in similarity searches, e.g. in the ElasticVectorSearch class.

Currently the search always returns 10 documents (Elasticsearch's default size).

const result = await this.client.search({
    size: value, // proposed addition
    index: this.indexName,
    knn: {
        field: "embedding",
        query_vector: query,
        filter: this.buildMetadataTerms(filter),
        k,
        num_candidates: this.candidates,
    },
});
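
To illustrate why the result count appears stuck at 10, here is a minimal sketch of how `k` and `size` are assumed to interact. It is a simplified model, not Elasticsearch code: it assumes a kNN-only request yields at most `k` matches and that the response is then capped at `size`, which falls back to 10 when omitted. The function name `expectedHitCount` is hypothetical.

```typescript
// Simplified model (an assumption, not Elasticsearch source code):
// a kNN-only search yields at most k matches, and the response is then
// capped at `size`, which defaults to 10 when the request omits it.
const DEFAULT_SIZE = 10;

function expectedHitCount(k: number, size: number = DEFAULT_SIZE): number {
  return Math.min(k, size);
}

console.log(expectedHitCount(25));     // 10: k is 25, but size falls back to 10
console.log(expectedHitCount(25, 25)); // 25: size raised to match k
console.log(expectedHitCount(5));      // 5: fewer than 10 neighbours requested
```

Under this model, asking for more than 10 neighbours has no visible effect unless `size` is raised as well. The index may of course hold fewer matching documents than either value.
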
dosubot[bot] commented 1 year ago

Answer generated by a 🤖

Answer

Hello,

Thank you for bringing this to our attention. You're correct that the current implementation of the similaritySearchVectorWithScore method in the ElasticVectorSearch class doesn't allow for the passing of the 'size' parameter. This is something that can be added to the method to allow for more flexibility in the number of documents returned.

Here's how you can modify the method to include the 'size' parameter:

  async similaritySearchVectorWithScore(
    query: number[],
    k: number,
    filter?: ElasticFilter | undefined,
    size: number = 10 // default value is 10
  ): Promise<[Document, number][]> {
    const result = await this.client.search({
      index: this.indexName,
      size: size, // add size parameter here
      knn: {
        field: "embedding",
        query_vector: query,
        filter: this.buildMetadataTerms(filter),
        k,
        num_candidates: this.candidates,
      },
    });

    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    return result.hits.hits.map((hit: any) => [
      new Document({
        pageContent: hit._source.text,
        metadata: hit._source.metadata,
      }),
      hit._score,
    ]);
  }

In the above code, I added a new parameter 'size' to the 'similaritySearchVectorWithScore' method and passed it to the search request. If no value is provided for 'size', it will default to 10.

This change should allow you to control the number of documents returned by the similaritySearchVectorWithScore method. If you have any further questions or run into any issues, please don't hesitate to let us know.
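
As a possible variation, `size` could fall back to `k` instead of 10, so that existing three-argument callers receive up to `k` documents without passing anything extra. The sketch below only builds the request object and does not call Elasticsearch. The helper `buildKnnSearchRequest` and both interfaces are hypothetical names introduced for illustration and are not part of LangChain; the field names mirror the request shown in this thread.

```typescript
// Hypothetical request shape; field names mirror the search call in this thread.
interface KnnSearchRequest {
  index: string;
  size: number;
  knn: {
    field: string;
    query_vector: number[];
    filter?: unknown;
    k: number;
    num_candidates: number;
  };
}

// Hypothetical options bag, not a LangChain type.
interface KnnRequestOptions {
  indexName: string;
  queryVector: number[];
  k: number;
  candidates: number;
  filter?: unknown;
  size?: number; // optional override; falls back to k when omitted
}

function buildKnnSearchRequest(opts: KnnRequestOptions): KnnSearchRequest {
  return {
    index: opts.indexName,
    // Assumption: defaulting size to k keeps the hit count aligned with k.
    size: opts.size ?? opts.k,
    knn: {
      field: "embedding",
      query_vector: opts.queryVector,
      filter: opts.filter,
      k: opts.k,
      num_candidates: opts.candidates,
    },
  };
}

// Usage: no size given, so the request asks for k (25) hits.
const request = buildKnnSearchRequest({
  indexName: "docs",
  queryVector: [0.1, 0.2, 0.3],
  k: 25,
  candidates: 200,
});
console.log(request.size); // 25
```

Keeping the request construction in a pure function like this makes the default easy to unit test without a running cluster.
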

If you feel comfortable making these changes, we would welcome a pull request. Your contribution would be greatly appreciated by the LangChain community.

Best regards.

This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.
