Auto-suggest queries impose a pre-filter limit to constrain the number of documents that may be scored (since scoring is expensive). However, this limit is applied only after the full set of candidate documents has been computed as the union of various DocIdSets. If the constituent DocIdSets are very large, computing their union can itself be unreasonably expensive and allocate heavily, only to potentially hit the limit and return no results anyway.
This patch detects such cases early:
processAutosuggestQuery will return no results if any of the constituent sets is too large (and hence the union would fail the pre-filter limit anyway). This does not change semantics.
mkAutosuggestQuery will ignore preceding terms that occur in too many documents. This does change the results, as explained in the comments, but I think it is the best we can do.
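For illustration only, here is a minimal Python sketch of the two checks described above. It is not the code in this patch: the function names `union_within_limit` and `usable_preceding_terms`, the use of plain Python sets in place of DocIdSets, and the reuse of a single `limit` as the "too many documents" threshold are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Set


def union_within_limit(doc_id_sets: List[Set[int]], limit: int) -> Optional[Set[int]]:
    """Return the union of the sets, or None if it exceeds the limit.

    Hypothetical stand-in for the check in processAutosuggestQuery.
    """
    # Early exit: a union is at least as large as its largest member,
    # so one oversized set already guarantees the limit would be exceeded.
    # This avoids building the union at all in that case.
    if any(len(s) > limit for s in doc_id_sets):
        return None

    # Otherwise compute the union and apply the pre-filter limit as before.
    union: Set[int] = set().union(*doc_id_sets)
    if len(union) > limit:
        return None
    return union


def usable_preceding_terms(
    postings: Dict[str, Set[int]],
    preceding_terms: List[str],
    limit: int,
) -> List[str]:
    """Keep only the preceding terms that occur in at most `limit` documents.

    Hypothetical stand-in for the filtering in mkAutosuggestQuery. Unlike
    the early exit above, dropping terms can change the results.
    """
    return [t for t in preceding_terms if len(postings.get(t, set())) <= limit]
```

The first function returns the same answer with or without the early exit, which is why that change preserves semantics; the second discards information from the query, which is why it does not.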