HackYourFuture-CPH / simply-name.it

Final Project for Class17
MIT License

Backend: Elasticsearch-powered user search #124

Closed: orhantoy closed this issue 2 years ago

orhantoy commented 3 years ago

This follows up on https://github.com/HackYourFuture-CPH/simply-name.it/issues/79 and covers integrating with Elasticsearch to deliver a more powerful search experience.

The overall steps needed to integrate with Elasticsearch are:

Note: the above steps could be further split into individual GitHub issues and this issue here could serve as a kind of parent issue.

Set up index

Example mapping:

PUT staging-simply-name-it-users
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "fullName": {
        "type": "search_as_you_type"
      },
      "email": {
        "type": "keyword"
      },
      "profilePictureUrl": {
        "index": false,
        "type": "keyword"
      }
    }
  }
}
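
Because the mapping sets "dynamic" to "strict", Elasticsearch rejects any document that contains a field not declared under "properties". A minimal Python sketch of the same check done client-side before indexing might look like this. The helper name `unknown_fields` and the sample documents are hypothetical and not part of the project; only top-level fields are checked.

```python
from typing import List

# The example mapping from above, expressed as a Python dict.
USER_MAPPING = {
    "mappings": {
        "dynamic": "strict",
        "properties": {
            "fullName": {"type": "search_as_you_type"},
            "email": {"type": "keyword"},
            "profilePictureUrl": {"index": False, "type": "keyword"},
        },
    }
}


def unknown_fields(document: dict, mapping: dict = USER_MAPPING) -> List[str]:
    """Return the top-level fields of `document` that the mapping does not declare.

    With "dynamic": "strict", a document containing any of these fields
    would be rejected by Elasticsearch at index time.
    """
    declared = mapping["mappings"]["properties"]
    return sorted(field for field in document if field not in declared)
```

For example, `unknown_fields({"fullName": "Jane Brown", "nickname": "jb"})` reports `nickname`, which would have to be added to the mapping or dropped from the document before indexing.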


Search

Example search query:

GET staging-simply-name-it-users/_search
{
  "query": {
    "multi_match": {
      "query": "brown f",
      "type": "bool_prefix",
      "fields": [
        "fullName",
        "fullName._2gram",
        "fullName._3gram"
      ]
    }
  }
}
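
With "type": "bool_prefix", every term except the last is matched as a full term and the last term ("f" here) is matched as a prefix, which is what makes the query suit search-as-you-type input. As a rough sketch, the backend could build this request body from whatever the user has typed so far. The function name `build_user_search` and the `size` default are assumptions for illustration; the field list is taken from the query above.

```python
# Fields generated by the search_as_you_type mapping on "fullName".
SEARCH_FIELDS = ["fullName", "fullName._2gram", "fullName._3gram"]


def build_user_search(text: str, size: int = 10) -> dict:
    """Build a _search request body for partially typed user names.

    `size` (an assumed default of 10) caps the number of hits returned.
    Surrounding whitespace is stripped before the text is used.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text.strip(),
                "type": "bool_prefix",
                "fields": list(SEARCH_FIELDS),
            }
        },
    }
```

Calling `build_user_search("brown f")` yields the same "query" clause as the example request, with an added "size" limit.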

Links

Enhancement ideas

Midnighter commented 3 years ago

> Daily sync job to make sure the Elasticsearch index is in sync with the database.

I'm curious about this point. Is that the current best practice? I have previously used Elasticsearch in a way that updates the index on every create, update, and delete.

orhantoy commented 3 years ago

@Midnighter That's also what is described here:

Ingest data into index

  • Update when updated in DB
  • Create when created in DB
  • Delete when deleted in DB

and that approach should be enough as long as failed requests to Elasticsearch are retried. Without retries, a daily sync job would be a "nice to have", depending on how critical it is that the index matches the database.
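
To make the retry idea concrete, here is a minimal Python sketch that turns a database event into a single-document Elasticsearch request and retries it with exponential backoff. Everything named here is an assumption for illustration: the event names, the `id` field on the user record, the `send` callable (a stand-in for whatever HTTP client the backend uses), and the retry limits. It is not the project's implementation.

```python
import time
from typing import Callable, Optional, Tuple

INDEX = "staging-simply-name-it-users"
MAPPED_FIELDS = ("fullName", "email", "profilePictureUrl")

# (HTTP method, path, JSON body or None)
Request = Tuple[str, str, Optional[dict]]


def to_request(event: str, user: dict, index: str = INDEX) -> Request:
    """Map a DB event ("created", "updated", "deleted") to an Elasticsearch request.

    The user's database id (assumed to be under "id") becomes the document _id,
    so creates and updates both use the same PUT, which indexes or replaces.
    """
    path = f"/{index}/_doc/{user['id']}"
    if event == "deleted":
        return ("DELETE", path, None)
    if event in ("created", "updated"):
        # Only send fields declared in the strict mapping.
        return ("PUT", path, {field: user[field] for field in MAPPED_FIELDS})
    raise ValueError(f"unknown event: {event}")


def send_with_retry(
    request: Request,
    send: Callable[[str, str, Optional[dict]], object],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call `send`, retrying on ConnectionError with exponential backoff.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    Re-raises the last error once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send(*request)
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

If `send_with_retry` still raises after the last attempt, the index and database have diverged for that user, which is the case a periodic sync job would catch.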