Closed MrCsabaToth closed 4 weeks ago
This has already been supported since v0.4.0 (https://github.com/google-gemini/generative-ai-dart/pull/149)
Oh, I searched within the issues but not the PRs! So it's part of the content request rather than the model instantiation, which means it can be changed dynamically per request!
Description of the feature request:
Since `text-embedding-004`, the API calls support the `outputDimensionality` parameter, which truncates the embedding vector to the given size. See https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#advanced-use
What problem are you trying to solve with this feature?
Reduce the storage size of the vectors as a trade-off for some accuracy/precision.
Any other information you'd like to share?
Partial workaround: since it sounds like there's no PCA (Principal Component Analysis) involved for, e.g., the 256-dimension output, it's a simple truncation, so until this is supported in the Dart API, one can perform the truncation client-side. The only remaining cost in that case is the extra bandwidth, which is most prominent with batch inference.
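The client-side truncation workaround above can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the helper name is made up, and the re-normalization step is an assumption for downstream cosine-similarity use (a truncated vector is generally no longer unit-length).

```python
import math

def truncate_embedding(vector, output_dimensionality=256):
    """Truncate an embedding to its first N dimensions and re-normalize.

    Sketches the simple-truncation behavior described above (no PCA).
    Re-normalizing keeps cosine similarity well-behaved afterwards.
    """
    truncated = vector[:output_dimensionality]
    norm = math.sqrt(sum(x * x for x in truncated))
    # Guard against a degenerate all-zero prefix.
    return [x / norm for x in truncated] if norm > 0 else truncated

# Example: a toy 4-dimensional "embedding" truncated to 2 dimensions.
embedding = [0.6, 0.8, 0.1, 0.2]
print(truncate_embedding(embedding, 2))  # → [0.6, 0.8] (already unit-length)
```

Note this only saves storage and comparison cost; the full-size vector still travels over the wire, which is the bandwidth cost mentioned above.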