rails / sdoc

Standalone sdoc generator
http://api.rubyonrails.org/

Switch from bigrams to trigrams for search #342

Closed jonathanhefner closed 8 months ago

jonathanhefner commented 9 months ago

Trigrams can provide more accurate search results than bigrams. For example, using bigrams, searching for "sel" would attempt to match the ngrams " s", "se", and "el". For the Rails API (at 7c65a4b83b583f4f), the top result is ActiveModel::Serializers due to "Model" matching "el" and ":Serial" matching " s" and "se". However, using trigrams, "sel" would attempt to match " se" and "sel". In that case, for the Rails API, the top result is ActiveRecord::QueryMethods#select.
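The "sel" example above can be reproduced with a small sketch. This is not sdoc's actual implementation. The function names `normalize`, `ngrams`, and `match_count` are hypothetical, and the normalization (lowercasing, treating non-alphanumeric characters as spaces, and prepending a single space) is an assumption inferred from the ngrams listed in the example. Real ranking presumably involves more than a raw count of matching ngrams.

```python
import re


def normalize(text: str) -> str:
    # Assumption: lowercase, and treat any non-alphanumeric character as a space.
    return re.sub(r"[^a-z0-9]", " ", text.lower())


def ngrams(text: str, n: int) -> list[str]:
    # Assumption: a single leading space marks the start of the text,
    # which is what yields " s" (bigram) and " se" (trigram) for "sel".
    padded = " " + text
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def match_count(query: str, candidate: str, n: int) -> int:
    # Count how many of the query's ngrams also occur in the candidate.
    candidate_ngrams = set(ngrams(normalize(candidate), n))
    return sum(1 for g in ngrams(normalize(query), n) if g in candidate_ngrams)


serializers = "ActiveModel::Serializers"
select = "ActiveRecord::QueryMethods#select"

# Bigrams: " s", "se", "el" all occur in both names, so they tie.
bigram_counts = (match_count("sel", serializers, 2), match_count("sel", select, 2))

# Trigrams: " se" occurs in both, but "sel" only occurs in "#select".
trigram_counts = (match_count("sel", serializers, 3), match_count("sel", select, 3))
```

Under these assumptions, both names match all three bigrams of "sel", so the bigram count alone cannot separate them. With trigrams, `ActiveRecord::QueryMethods#select` matches both trigrams while `ActiveModel::Serializers` matches only " se".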

The downside to using trigrams is that the search index grows from 2.9 MB to 8.6 MB, roughly a threefold increase. But the data compresses well, so the gzipped size only increases from 474 kB to 670 kB (about 41%). Browser heap snapshot size also stays reasonably small, increasing from 6.8 MB to 11.1 MB in Firefox and from 8.0 MB to 22.2 MB in Chrome.