Azure / azure-sdk

This is the Azure SDK parent repository and mostly contains documentation around guidelines and policies as well as the releases for the various languages supported by the Azure SDK.
http://azure.github.io/azure-sdk
MIT License
487 stars 297 forks source link

Board Review: Search #1041

Closed bryevdv closed 4 years ago

bryevdv commented 4 years ago

The Basics

About this client library

Artifacts required (per language)

We use an API review tool (apiview) to support .NET and Java API reviews. For Python and TypeScript, use the API extractor tool, then submit the output as a Draft PR to the relevant repository (azure-sdk-for-python or azure-sdk-for-js).

.NET

Java

Python

JavaScript / TypeScript

https://github.com/Azure/azure-sdk-for-js/pull/7614

https://gist.github.com/xirzec/17c8192e41a8cd40ceba82c5c39339f2#scenarios

Champion Scenarios

The context of all the following scenario is:

Customer runs a web store. As part of the store, they have a product search capability.

Search for a documents in an Azure Search index with simple text, and get the first result

Customer may wish to perform a query with basic search text and obtain the first result returned to display to their users.

Filter search results from an Azure Search index

Customers may wish to afford the option to filter search results by specific conditions, order the results in specified ways, or limit returned results to a subset of fields.

Get a list of search suggestions

Customers may wish to ask for a list of search suggestions based on a given search text, in order to guide users search experience.

Upload documents to an Azure Search index

Customers may wish to add new documents to a search index (being informed if a given document already exists), e.g. to represent new products they are making available that, that their users should able to find.

Merge or upload a document in an Azure Search index (with a new field)

Customers may have a set of documents, which may or may not already exist in the search index, that they wish to incorporate into the index, .e.g if changes to existing product descriptions need to be made available to their users.

Batch CRUD operations on documents

Customer may wish to mass-update many documents of their catalogue at once, in the most efficient way possible.

Agenda for the review

A board review is generally split into two parts, with additional meetings as required

Part 1 - Introducing the board to the service:

After part 1, you may schedule additional meetings with architects to refine the API and work on implementation.

Part 2 - the "GA" meeting

Thank you for your submission

adrianhall commented 4 years ago

Scheduled for Wednesday Mar-4

adrianhall commented 4 years ago

Champion Scenarios:

TABLED: Customizable JSON serializer within .NET (to support JSON.NET)

ACTION: Work on the case of partial success/failure. @brjohnstmsft has details on how it worked in track-1 & Java.

Recording: https://msit.microsoftstream.com/video/1cb4f2c1-0cf4-4633-aee4-7e90ec9e6501

bterlson commented 4 years ago

@xirzec and I discussed a possible alternative to the PagedAsyncIterable approach.

First, some possible guideline modifications: We are thinking that PagedAsyncIterable isn't appropriate if the following are true:

  1. you usually want pages, because PagedAsyncIterable puts per-item iterators front-and-center, and hides pages behind byPage().
  2. you usually just want the first page.

​1 is likely the case because users will often want facets, and #​2 is likely true because we've heard from the service that most result sets fit in a single response.

Therefore, we could consider an approach that hides the server-driven paging entirely (or, as an "advanced API") and instead expose a more typical client-driven paging model. Users pass skip and top, and if a user requests top N where N is larger than what fits in a single response, we will request the next page for them (which, we've heard, will be rare in practice).

tg-msft commented 4 years ago

Search Paging

We had a chat to follow up on the discussion from https://github.com/Azure/azure-sdk/issues/1041 and the comments from TS folks at https://github.com/Azure/azure-sdk/issues/1041#issuecomment-594941663.

The TS folks suggested an inversion of the async paging approach we've taken for other services as the default given that most users want a single page. Krzysztof doesn't want to invent new patterns, but felt the key for his understanding was considering this to be a tuple of facet info and search results. Bruce helped us understand some of the pain points between client-side paging (which we want to make easy) and server-side paging (which ideally we'll hide as much as possible as an implementation detail). We threw out a lot of ideas and settled here. There was some concern about the "double await" but nobody thought customers would get hung up there.

C# and TS will try to switch to this approach for Preview 1. Java and Python can wait for a future preview.

C

We settled on something pretty close to what's already in the ApiView listing. We'll have APIs that look like:

public Response<SearchResults> Search(...);
public async Task<Response<SearchResults>> SearchAsync(...);

public class SearchResults<T>
{
    public IDictionary<string, ICollection<FacetResult>> Facets { get; }
    public double? Coverage { get; }
    public long? TotalCount { get; }

    public AsyncPageable<SearchResult<T>> GetResultsAsync();
    public Pageable<SearchResult<T>> GetResults();
}

And would use that via:

var results = await client.SearchAsync(...);
// render results.Facets
await foreach (var item in results.GetResultsAsync())
{
    // render search item
}

TS

The same approach would look like this in in TS:

// proposal one:
const result = await client
  .search({
    searchText: "WiFi",
    facets: ["Category,count:3,sort:count", "Rooms/BaseRate,interval:100"]
  });

console.log(result.facets);

// result.results() returns an PagedAsyncIterator of SearchResult, SearchResult[]
for await (const searchResult of result.results()) {
    // do something with searchResult
    console.log(searchResult);
}
xirzec commented 4 years ago

The above is now implemented in Azure/azure-sdk-for-js#7641 with one minor change: result.results is the iterator itself, not a method to return an iterator:

for await (const searchResult of result.results) {
    // do something with searchResult
    console.log(searchResult);
}