Hirevo / alexandrie

An alternative crate registry, implemented in Rust.
https://hirevo.github.io/alexandrie/
Apache License 2.0
493 stars 55 forks source link

Add tantivy index #150

Closed Dalvany closed 1 year ago

Dalvany commented 1 year ago

This pull request aims to address #19 issue by adding a Tantivy index. While there is still room for improvement, it might be a first step.

Configuration

Add a section [search] with a field directory that contains where Tantivy should store its files.

[search]
directory = "/tmp/alexandrie/tantivy

Search

As it uses QueryParser you can use full Tantivy query language.

By default search use all fields except name.prefix (see below) and suggester search amongst name, name.full and name.prefix (see below).

Implementation

fts module

This module contains all full text search related structures.

Tantivy structure handles all boiler plate to setup an index, search and suggest. It also delegate method to index document, commit documents.

TantivyDocument is a structure that represents a crate and can be converted into a Tantivy's Document

Indices

Crate's name are index multiple times to improve both result relevance of suggester and search.

Other fields that are indexed :

Note that at search time, we should not apply apply the edge ngram filter to reduce noise.

How to index

When Alexandrie starts, it index everything.

Things that still need work