Azure / azure-sdk-for-js

This repository is for active development of the Azure SDK for JavaScript (NodeJS & Browser). For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/javascript/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-js.

Azure AI Search 2024-11-01-Preview #31235

Open alzimmermsft opened 1 month ago

This issue outlines the work to be done for the Search 2024-11-01-Preview release.

These are the links to generate from:

https://raw.githubusercontent.com/Azure/azure-rest-api-specs/14531a7cf6101c1dd57e7c1c83103a047bb8f5bb/specification/search/data-plane/Azure.Search/preview/2024-11-01-preview/searchindex.json

https://raw.githubusercontent.com/Azure/azure-rest-api-specs/14531a7cf6101c1dd57e7c1c83103a047bb8f5bb/specification/search/data-plane/Azure.Search/preview/2024-11-01-preview/searchservice.json

Feature: Hierarchical Aggregation and Facet Filtering

API changes

On request, there are no API changes. Facets are exposed as Strings in the API, and this feature only extends the facet syntax to allow hierarchical aggregation and facet filtering. On response, FacetResult becomes a recursive data structure: in addition to the existing count: int64, it now has facets: Map<String, FacetResult>.
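To make the recursive response shape concrete, here is a minimal sketch of how a client might model and walk it. The interface mirrors the description above (count plus a nested facets map); the `flattenFacets` helper, the path separator, and the sample data are illustrative assumptions, not part of the generated SDK.

```typescript
// Hypothetical model of the recursive response shape described above.
interface FacetResult {
  count?: number;
  facets?: Record<string, FacetResult>;
}

// Illustrative helper (an assumption, not an SDK API): walks the tree
// depth-first and records each facet's path together with its count.
function flattenFacets(
  name: string,
  facet: FacetResult,
  out: Array<[string, number]> = []
): Array<[string, number]> {
  out.push([name, facet.count ?? 0]);
  const children = facet.facets ?? {};
  for (const childName of Object.keys(children)) {
    flattenFacets(`${name}/${childName}`, children[childName], out);
  }
  return out;
}

// Made-up sample data, only to exercise the helper.
const sampleFacet: FacetResult = {
  count: 10,
  facets: {
    brand: { count: 6, facets: { color: { count: 2 } } },
  },
};

const flattened = flattenFacets("category", sampleFacet);
console.log(flattened);
```

Any consumer that previously read only `count` keeps working, since the nested map is purely additive.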

Caption and Answer length in Semantic Search

API changes

On request, this updates the custom classes that were added to support Semantic Search Answers and Captions. Those classes expose the configuration settings as strongly typed values rather than the pipe (|) delimited format the wire request uses. They need a new maxcharlength: int32 setting and must serialize it into the pipe delimited string correctly.
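A rough sketch of what the serialization could look like follows. The `ExtractiveOptions` type and `toWireString` function are hypothetical names, and the `key-value` segment layout (for example `maxcharlength-200`) is an assumption that should be checked against the Swagger linked above.

```typescript
// Hypothetical strongly typed options; not the SDK's actual class.
interface ExtractiveOptions {
  count?: number;
  maxCharLength?: number;
}

// Builds the pipe delimited wire value from the typed options.
// The segment format ("count-N", "maxcharlength-N") is an assumption.
function toWireString(kind: "extractive", options: ExtractiveOptions = {}): string {
  const parts: string[] = [kind];
  if (options.count !== undefined) {
    parts.push(`count-${options.count}`);
  }
  if (options.maxCharLength !== undefined) {
    // int32 on the wire, so reject fractions and non-positive values early.
    if (!Number.isInteger(options.maxCharLength) || options.maxCharLength < 1) {
      throw new RangeError("maxCharLength must be a positive integer");
    }
    parts.push(`maxcharlength-${options.maxCharLength}`);
  }
  return parts.join("|");
}

console.log(toWireString("extractive", { count: 3, maxCharLength: 200 }));
```

Omitting a setting simply drops its segment, so existing callers that pass no options would still send the bare `extractive` value.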

Runtime notes

This feature is only available where the query type is semantic.

Query rewriting for improved L1 retrieval

API changes

On request, this is an update to both VectorQuery and SearchRequest to support a new pipe delimited string that indicates how many additional queries the Search service will generate based on the given query to improve L1 retrieval. For this feature we should create a custom helper class similar to what was added for Semantic Search Answers and Captions that provides a strongly typed way to configure this feature. Name proposal is QueryRewrite with configuration values of type: QueryRewritesType and count: int32.

On response, this adds a new SemanticQueryRewriteResultType field, which is metadata about how the Search service ran the search request. It also updates DebugInfo to include QueryRewritesDebugInfo, which carries debugging data about how the query rewrites modified the search request.
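The request-side helper proposed above might look roughly like the sketch below. `QueryRewrite` and `QueryRewritesType` are the names proposed in this issue; the enum values (`none`, `generative`), the `count-N` segment format, and the `serializeQueryRewrite` function are assumptions to verify against the Swagger.

```typescript
// Enum values are an assumption; confirm against the Swagger definition.
type QueryRewritesType = "none" | "generative";

// Shape follows the proposal in this issue: a type plus an int32 count.
interface QueryRewrite {
  type: QueryRewritesType;
  count?: number;
}

// Hypothetical serializer producing the pipe delimited wire value.
function serializeQueryRewrite(rewrite: QueryRewrite): string {
  if (rewrite.count === undefined) {
    return rewrite.type;
  }
  if (!Number.isInteger(rewrite.count) || rewrite.count < 1) {
    throw new RangeError("count must be a positive integer");
  }
  return `${rewrite.type}|count-${rewrite.count}`;
}

console.log(serializeQueryRewrite({ type: "generative", count: 3 }));
```

Because both VectorQuery and SearchRequest accept the same string, a single serializer like this could be shared between them.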

Runtime notes

This feature is only available where the query type is semantic.

Vector Search HNSW storage optimization

API changes

When configuring vector compression, there is a new setting that determines how the Search service should manage the original vectors (the vectors before compression). This is a new field on the compression configuration.

Runtime notes

If the vector compression is configured to discard the original vectors, the Search service is unable to rescore results with the original vectors. Rescoring therefore cannot be enabled when discard is chosen. Additionally, the storage option for the original vectors cannot be changed after creation.
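If the SDK wanted to surface these two constraints before the request reaches the service, a client-side check could look like this sketch. All names here (`CompressionSettings`, `originalVectors`, `rescoreEnabled`, and both functions) are hypothetical and do not reflect the generated model; the service remains the source of truth for validation.

```typescript
// Hypothetical, simplified view of a compression configuration.
type OriginalVectorStorage = "preserve" | "discard";

interface CompressionSettings {
  originalVectors: OriginalVectorStorage;
  rescoreEnabled: boolean;
}

// Rule 1: rescoring needs the original vectors, so it conflicts with discard.
function validateCompression(settings: CompressionSettings): string[] {
  const errors: string[] = [];
  if (settings.originalVectors === "discard" && settings.rescoreEnabled) {
    errors.push("Rescoring cannot be enabled when original vectors are discarded.");
  }
  return errors;
}

// Rule 2: the storage option is fixed once the configuration is created.
function validateCompressionUpdate(
  existing: CompressionSettings,
  updated: CompressionSettings
): string[] {
  const errors = validateCompression(updated);
  if (existing.originalVectors !== updated.originalVectors) {
    errors.push("The original vector storage option cannot be changed after creation.");
  }
  return errors;
}

console.log(validateCompression({ originalVectors: "discard", rescoreEnabled: true }));
```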

Markdown parsing mode

API changes

Two new settings were added to IndexingParametersConfiguration: MarkdownParsingSubmode and MarkdownHeaderDepth. Historically, indexing configuration was just a bag of additional properties, and strong typing was added later. This may require updates to pull these new values out of the additional properties used in older versions.
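One way to handle that migration is to lift the known keys out of the loosely typed bag and leave everything else untouched, as in the sketch below. The property names (`markdownParsingSubmode`, `markdownHeaderDepth`), the `promoteMarkdownSettings` helper, and the sample values are assumptions for illustration only.

```typescript
// Hypothetical typed view of the two new settings.
interface TypedMarkdownSettings {
  markdownParsingSubmode?: string;
  markdownHeaderDepth?: string;
}

// Splits a loosely typed property bag into the typed settings and the rest.
// Values of an unexpected type are dropped from the typed view.
function promoteMarkdownSettings(bag: Record<string, unknown>): {
  typed: TypedMarkdownSettings;
  additionalProperties: Record<string, unknown>;
} {
  const { markdownParsingSubmode, markdownHeaderDepth, ...rest } = bag;
  const typed: TypedMarkdownSettings = {};
  if (typeof markdownParsingSubmode === "string") {
    typed.markdownParsingSubmode = markdownParsingSubmode;
  }
  if (typeof markdownHeaderDepth === "string") {
    typed.markdownHeaderDepth = markdownHeaderDepth;
  }
  return { typed, additionalProperties: rest };
}

// Sample values are illustrative only.
const promoted = promoteMarkdownSettings({
  markdownParsingSubmode: "oneToMany",
  markdownHeaderDepth: "h3",
  someOtherSetting: true,
});
console.log(promoted);
```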

Azure Document Intelligence skill

API changes

This is a new skill added for Document Intelligence. Should be a simple regeneration with no manual changes.

AI Services bill to sub-domain

API changes

New subtypes AIServicesAccountKey and AIServicesAccountIdentity for CognitiveServicesAccount have been added which can tell the Search service which subdomain to bill. Should be a simple regeneration with no manual changes.

In addition to the new API changes, there is one Swagger fix that needs to be applied with a Swagger transform:

### Fix `SearchResult["@search.documentDebugInfo"]`
``` yaml $(tag) == 'searchindex'
directive:
  - from: swagger-document
    where: $.definitions.SearchResult.properties
    transform: >
      $["@search.documentDebugInfo"]["$ref"] = $["@search.documentDebugInfo"].items["$ref"];
      delete $["@search.documentDebugInfo"].type;
      delete $["@search.documentDebugInfo"].items;
```