NuGet / NuGetGallery

NuGet Gallery is a package repository that powers https://www.nuget.org. Use this repo for reporting NuGet.org issues.
Apache License 2.0

[Azure Search] Consider shingling #7390

Open loic-sharma opened 5 years ago

loic-sharma commented 5 years ago

Target queries:

This MAY be helped:

loic-sharma commented 5 years ago

Changes here: https://github.com/NuGet/NuGet.Services.Metadata/compare/dev...loshar-shingles?expand=1

Results: this improved the "mysql", "fxcop", and "entityframework" queries but regressed the "aspnet" query. The improved queries are more common, so this seems like a good tradeoff. We think the "aspnet" query may be improved by the "edge dependencies" work, so we want to try that first before continuing with this change.
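
For readers unfamiliar with the term, here is a minimal sketch of what shingling does to an already-tokenized package ID. It is not the code from the branch linked above: the function name `shingle`, its parameters, and the choice of an empty separator are assumptions made for illustration only.

```python
def shingle(tokens, min_size=2, max_size=3, separator=""):
    """Return the original tokens plus every run of adjacent tokens joined together.

    Hypothetical helper: sizes and separator are illustrative defaults,
    not values taken from the NuGet search index.
    """
    out = list(tokens)
    for size in range(min_size, max_size + 1):
        # Slide a window of `size` tokens across the token list.
        for start in range(len(tokens) - size + 1):
            out.append(separator.join(tokens[start:start + size]))
    return out


# Assumed tokenization of an ID like "EntityFramework.Core".
tokens = ["entity", "framework", "core"]
shingles = shingle(tokens)
```

With shingles in the index, a query typed as a single word such as "entityframework" can match a package whose ID was split into "entity" and "framework".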

loic-sharma commented 5 years ago

As part of this work, consider folding IdentifierCustomTokenFilter into PackageIdCustomTokenizer.

loic-sharma commented 5 years ago

@chgill-MSFT suggested that we prioritize fields that aren't shingled over shingled fields. This may help the case where searching for "aspnet" returns AspNetCore results instead of AspNet results.
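
A rough sketch of that idea, assuming each package is indexed into one non-shingled field and one shingled field. The field names, weights, and token sets below are all hypothetical and do not reflect the real index definition or scoring profile.

```python
# Hypothetical weights: a match in the non-shingled field counts for more.
FIELD_WEIGHTS = {"plain": 2.0, "shingled": 1.0}


def score(query, doc):
    """Sum the weights of every field whose token set contains the query term."""
    return sum(weight for field, weight in FIELD_WEIGHTS.items() if query in doc[field])


# Made-up token sets for two package IDs, for illustration only.
docs = {
    "AspNetCore": {"plain": {"aspnetcore"}, "shingled": {"aspnet", "netcore", "aspnetcore"}},
    "AspNet": {"plain": {"aspnet"}, "shingled": {"aspnet"}},
}

# Rank package IDs by descending score for the query "aspnet".
ranked = sorted(docs, key=lambda name: score("aspnet", docs[name]), reverse=True)
```

Under these assumptions AspNetCore still matches "aspnet" through its shingled field, but AspNet ranks first because it also matches in the higher-weighted non-shingled field.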