staticbackendhq / core

Backend server API handling user mgmt, database, storage and real-time component
https://staticbackend.com
MIT License
682 stars 67 forks source link

Implement a full-text search capability #64

Closed dstpierre closed 1 year ago

dstpierre commented 1 year ago

A full-text catalog is a useful feature for most application.

This is a proposal of how I'd see this feature being added.

Overall concept

To keep things simple I think SB would handle 1 reserve table/collection that supports FTS. Its schema could be similar to this:

Name: sb_fts (this would be on all tenant database)

id:string auto id
accountId:string the account owner
referenceId:string the id this item refers to
searchField:string the fts content to search in
previewData: JSON representation of a tiny "view" of the referenced object

An database would have only one table to perform full-text search.

The full-text index would index what's inside the searchField

The caller would receive a previewData list of all matches. This is useful since most full-text search output to a search results usually. Having a quick way to display the important data for matches is handy.

The referenceId is the id of the referenced entity. Continuing our analogy of a search result, one could build a URL from this ID to load the entire entity.

The "how"

PostgreSQL and MongoDB support full-text search natively. If SQLite is implemented (#63 ) it's also supported in SQLite.

The memory database impl would most certainly not offer FTS. Or if time allow, we could leverage an in-memory text search. TBD.

We could have a simple Search function in the Persister interface. Each database provider would implement their own version.

We'd also require a function like IndexContent (name to be refine) in the Persister interface.

An example of those function prototypes could be:

// in model
type FullTextData struct {
  ID string `json:"id"`
  AccountID string `json:"accountId"
  ReferenceID string `json:"refId"
  SearchField string `json:"searchField"
  PreviewData map[string]any `json:"previewData"
}
func Search(auth model.Auth, search string) ([]FullTextData, error)
func IndexContent(auth model.Auth, data FullTextData) error

To be careful

Since the sb_fts hold preview data of real entities in other table/collection they'd need to reflect updated values and be removed when the main entity is deleted.

At this moment, I'm not certain if this would be the responsibility of the dev or SB.

The deverloper could listen to database events and create functions that react to updates and deletes and apply the desire changes to the fts table/collection.

If it's SB's responsibility, I don't see how it can know about the schema the previewData should have.

Some ideas:

  1. Maybe it can perform a get by id and updates the previewData map with matching keys from the updated document. This remove SB from having to know anything about user's data.

For now, that's the only way that comes to mind. TBD.

API endpoint

This could be a reserved word like /db/fts this means that a user could never have a table/collection called fts, maybe could be sbfts. TBD.

This would need to be added to all client library as well as the backend package for Go devs using SB directly.

Deploying those changes

This will most certainly require a new SQL migration for PostgreSQL. And since the sb_fts table is defined in "user land", this migration would need to add it for all existing database (PG schema).

This will be the first time a changes need to update all users' databases. This will require testing before going into production.