google-gemini / generative-ai-dart

The official Dart library for the Google Gemini API
https://ai.google.dev/gemini-api/docs/get-started/tutorial?lang=dart
Apache License 2.0

Support outputDimensionality reduction parameter for embedding models #208

Open MrCsabaToth opened 2 weeks ago

MrCsabaToth commented 2 weeks ago

Description of the feature request:

Since text-embedding-004, the API supports an outputDimensionality parameter, which truncates the embedding vector to the given size. See https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#advanced-use

{
  "instances": [
    { "content": "TEXT",
      "task_type": "TASK_TYPE",
      "title": "TITLE"
    },
  ],
  "parameters": {
    "autoTruncate": AUTO_TRUNCATE,
    "outputDimensionality": OUTPUT_DIMENSIONALITY
  }
}

What problem are you trying to solve with this feature?

Reduce the storage size of the vectors, trading off some accuracy/precision.

Any other information you'd like to share?

Partial workaround: since it doesn't sound like any PCA (Principal Component Analysis) is applied when reducing to, say, 256 dimensions (it's a simple truncation), callers can perform the truncation themselves until this is supported in the Dart API (see the sketch below). The only remaining cost in that case is the extra bandwidth, which is most noticeable with batch inferences.
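
A minimal sketch of that client-side workaround, assuming the `google_generative_ai` package's `embedContent` call returning an embedding with a `values` list; the 256-dimension target and the `GOOGLE_API_KEY` environment variable are just placeholders for illustration:

```dart
import 'package:google_generative_ai/google_generative_ai.dart';

Future<List<double>> truncatedEmbedding(String text, {int dims = 256}) async {
  final model = GenerativeModel(
    model: 'text-embedding-004',
    apiKey: const String.fromEnvironment('GOOGLE_API_KEY'),
  );
  final response = await model.embedContent(Content.text(text));
  final values = response.embedding.values;
  // Client-side workaround: keep only the first `dims` components,
  // mirroring what the server-side outputDimensionality truncation does.
  return values.length > dims ? values.sublist(0, dims) : values;
}
```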

davidmigloz commented 2 weeks ago

It has already been supported since v0.4.0 (https://github.com/google-gemini/generative-ai-dart/pull/149).

MrCsabaToth commented 2 weeks ago

Oh, I searched within the issues but not the PRs! So it's part of the content request rather than the model instantiation, which means it can be changed dynamically per request!
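
A minimal sketch of that per-request usage, assuming the `outputDimensionality` named parameter on `embedContent` and `EmbedContentRequest` added in PR #149 (parameter names and placement should be verified against the current package docs):

```dart
import 'package:google_generative_ai/google_generative_ai.dart';

Future<void> main() async {
  final model = GenerativeModel(
    model: 'text-embedding-004',
    apiKey: const String.fromEnvironment('GOOGLE_API_KEY'),
  );

  // Single request: the dimensionality is chosen per call, not per model.
  final single = await model.embedContent(
    Content.text('Hello world'),
    outputDimensionality: 256,
  );
  print(single.embedding.values.length);

  // Batch request: each EmbedContentRequest can carry its own setting.
  final batch = await model.batchEmbedContents([
    EmbedContentRequest(Content.text('First doc'), outputDimensionality: 256),
    EmbedContentRequest(Content.text('Second doc'), outputDimensionality: 128),
  ]);
  for (final embedding in batch.embeddings) {
    print(embedding.values.length);
  }
}
```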