Open danielballan opened 10 months ago
In discussions with @Kezzsim, we are going ahead with TypeSense, as an optional add-on in the same way that Prometheus is an optional add-on.
I think that this will involve:
Adding a typesense parameter to the Catalog constructors, which takes None (the default, meaning no Typesense) or a config dict like:
    {
        'api_key': 'Hu52dwsas2AdxdE',
        'nodes': [{
            'host': 'localhost',
            'port': '8108',
            'protocol': 'http'
        }],
        'connection_timeout_seconds': 2
    }
Tiled config like:
    trees:
      - tree: catalog
        args:
          uri: postgresql+asyncpg://...
          typesense:
            api_key: $TYPESENSE_API_KEY
            nodes:
              - host: localhost
                port: 8108
                protocol: http
            connection_timeout_seconds: 2
will just work, with no code changes to the config parser.
Updating Context.__init__ and creating an instance of a typesense.Client, held as self.typesense_client on the Context.
Also in Context.__init__, registering [after_insert](https://docs.sqlalchemy.org/en/20/orm/events.html#sqlalchemy.orm.MapperEvents.after_insert) and after_update SQLAlchemy events that make the relevant calls on self.typesense_client. (I remain not entirely clear what these hooks give you access to, but the docs look promising.)
Adding a new module tiled.commandline._typesense and updating tiled.commandline.main to add a tiled typesense subcommand to the CLI. I imagine we will need:
    tiled typesense init TYPESENSE_URL [ANOTHER_TYPESENSE_URL]  # define schemas
    tiled typesense rebuild TYPESENSE_URL [ANOTHER_TYPESENSE_URL]  # drop data (if any) and rebuild
The utility urllib.parse.urlparse can be used to get from a CLI-friendly string like http://localhost:8108?api_key=Hu52dwsas2AdxdE into the structure:
    {
        'api_key': 'Hu52dwsas2AdxdE',
        'nodes': [{
            'host': 'localhost',
            'port': '8108',
            'protocol': 'http'
        }],
        'connection_timeout_seconds': 2
    }
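The URL-to-config conversion can be sketched with the standard library alone. This is only an illustration, not existing Tiled code: the function name `typesense_config_from_urls` is hypothetical, the fallback port 8108 assumes Typesense's default, and requiring one shared `api_key` across all URLs is a design assumption.

```python
from urllib.parse import parse_qs, urlparse


def typesense_config_from_urls(urls, connection_timeout_seconds=2):
    """Build a Typesense-style config dict from one or more CLI URLs.

    Hypothetical helper. Each URL contributes one node; the api_key is
    read from the query string and must be the same wherever it appears.
    """
    nodes = []
    api_keys = set()
    for url in urls:
        parsed = urlparse(url)
        if not parsed.scheme or parsed.hostname is None:
            raise ValueError("Expected a URL like http://host:port?api_key=...")
        # parse_qs maps each query parameter to a list of values.
        api_keys.update(parse_qs(parsed.query).get("api_key", []))
        nodes.append(
            {
                "host": parsed.hostname,
                # Assumption: fall back to 8108 when no port is given.
                "port": str(parsed.port or 8108),
                "protocol": parsed.scheme,
            }
        )
    if len(api_keys) != 1:
        raise ValueError("Expected exactly one distinct api_key across all URLs")
    return {
        "api_key": api_keys.pop(),
        "nodes": nodes,
        "connection_timeout_seconds": connection_timeout_seconds,
    }
```

Passing several URLs yields several entries under `nodes`, which matches the `TYPESENSE_URL [ANOTHER_TYPESENSE_URL]` shape of the proposed subcommands.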
All of the above is up for a rethink; it is just meant as a quick sketch to highlight the relevant sections of the Tiled code that I can see will need to be touched.
From discussion on 20 Feb:
    typesense_ingestion:
      - spec: BlueskyRun
        fields:
          - name: detectors  # field name in TypeSense
            path: "start.detectors"  # path into Tiled JSON metadata
            # Also type?
      - spec: SomeOtherThing
        ...
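To make the ingestion mapping concrete, here is a minimal sketch of how such a config could be applied to one node's metadata. Everything here is an assumption for illustration: the helper names `get_path` and `build_document` are hypothetical, specs are simplified to plain strings, and the node key is used as the document `id`.

```python
def get_path(metadata, dotted_path, default=None):
    """Follow a dotted path such as "start.detectors" into nested dicts."""
    node = metadata
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def build_document(node_key, specs, metadata, ingestion_config):
    """Return a flat document for the first matching spec rule, else None.

    Hypothetical helper: `ingestion_config` is the parsed list found
    under `typesense_ingestion`, and `specs` is a list of spec names.
    """
    for rule in ingestion_config:
        if rule["spec"] not in specs:
            continue
        document = {"id": node_key}
        for field in rule["fields"]:
            value = get_path(metadata, field["path"])
            # Skip fields that are absent from this node's metadata.
            if value is not None:
                document[field["name"]] = value
        return document
    return None
```

A node with no matching spec yields `None`, so it would simply not be indexed. The open "Also type?" question is not addressed here.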
    # config.yml
    authentication:
      # The default is false. Set to true to enable any HTTP client that can
      # connect to _read_. An API key is still required to write.
      allow_anonymous_access: false
      single_user_api_key: "secret"  # for dev
    trees:
      - path: /
        tree: catalog
        args:
          uri: "sqlite+aiosqlite:///:memory:"
          # or, uri: "sqlite+aiosqlite:////catalog.db"
          # or, "postgresql+asyncpg://..."
          writable_storage: "data/"
          init_if_not_exists: true
          typesense_client:
            schema:
            connection_info:
$ tiled serve config config.yml
The built-in MapAdapter and the external databroker.mongo_normalized adapter support the FullText query. We will add support for FullText in the built-in SQL-backed Catalog Adapter in #456 and #457, for SQLite and PostgreSQL respectively.

Next, we should consider fuzzy matching and search suggestions. This has often been done with the ELK stack, but that is a heavy stack to take on for the sake of just one of its features. What are our options?
@Kezzsim highlighted the project typesense, which is exactly targeted at serving this use case without taking on the weight of ELK.
Also, I believe there is some functionality in this space available in SQLite and PostgreSQL. While not at the level of ELK, it would be good to understand precisely how far we can get with the tech stack we already have, and what its limitations are.
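As one data point on what the existing stack offers, here is a minimal sketch of prefix search using SQLite's FTS5 extension through Python's sqlite3 module. It assumes the linked SQLite build has FTS5 enabled (common, but not guaranteed). The table and column names are hypothetical and are not Tiled's catalog schema.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table and load (key, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    # `key` is stored but not tokenized; `body` is full-text indexed.
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(key UNINDEXED, body)")
    conn.executemany("INSERT INTO docs (key, body) VALUES (?, ?)", rows)
    return conn


def search(conn, query):
    """Return keys of matching rows, best match first."""
    cursor = conn.execute(
        "SELECT key FROM docs WHERE docs MATCH ? ORDER BY rank", (query,)
    )
    return [key for (key,) in cursor]


conn = build_index(
    [
        ("scan-1", "energy scan with pilatus detectors"),
        ("scan-2", "temperature ramp without imaging"),
    ]
)
prefix_hits = search(conn, "detect*")  # trailing * makes this a prefix query
typo_hits = search(conn, "detectrs")   # misspelled term
```

Prefix queries like `detect*` could back simple search-as-you-type suggestions. The misspelled query finds nothing, because FTS5 matches whole tokens and token prefixes. Typo tolerance would need something beyond this sketch, and that is the gap a tool like Typesense aims to fill.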