Description
This pull request introduces the aggregationMethod parameter to the DocumentSimilarityRanker annotator. The new parameter allows users to specify the method used to aggregate multiple sentence embeddings into a single vector representation.
Motivation and Context
The parameter lets users tailor the aggregation method to their use case: a general overview (AVERAGE), a focus on the initial context (FIRST), or an emphasis on the strongest signals (MAX).
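To make the three options concrete, here is a minimal pure-Python sketch of what each method could compute. It is not the annotator's actual implementation: the `aggregate` helper is hypothetical, and reading MAX as an element-wise maximum across sentence vectors is an assumption based on the "strongest signals" wording above.

```python
from typing import List, Sequence


def aggregate(embeddings: Sequence[Sequence[float]], method: str = "AVERAGE") -> List[float]:
    """Collapse per-sentence vectors into one document vector (illustrative only)."""
    if not embeddings:
        raise ValueError("need at least one sentence embedding")
    dim = len(embeddings[0])
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError("all sentence embeddings must have the same dimension")

    method = method.upper()
    if method == "AVERAGE":
        # Mean of each dimension across all sentences: a general overview.
        return [sum(col) / len(embeddings) for col in zip(*embeddings)]
    if method == "FIRST":
        # Keep only the first sentence's vector: the initial context.
        return list(embeddings[0])
    if method == "MAX":
        # ASSUMPTION: element-wise maximum across sentences.
        return [max(col) for col in zip(*embeddings)]
    raise ValueError(f"unknown aggregation method: {method}")


# Two hypothetical 2-dimensional sentence embeddings.
sentences = [[1.0, 4.0], [3.0, 2.0]]
average_vec = aggregate(sentences, "AVERAGE")
first_vec = aggregate(sentences, "FIRST")
max_vec = aggregate(sentences, "MAX")
```

With these toy vectors, AVERAGE blends both sentences, FIRST ignores the second sentence entirely, and MAX picks the larger value per dimension regardless of which sentence it came from.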
This change solves the following issues:
Issue #14368
Issue #14195
How Has This Been Tested?
Local Tests
Google Colab notebook
Screenshots (if appropriate):
Types of changes
[x] Bug fix (non-breaking change which fixes an issue)
[x] Code improvements with no or little impact
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
[x] My code follows the code style of this project.
[x] My change requires a change to the documentation.