Nixiesearch is a hybrid search engine that fine-tunes to your data.
status: red
on your cluster. Want to learn more? Go straight to the quickstart.
Unlike some of the other vector search engines:
The project is in active development and not intended for production use just yet. Stay tuned and reach out if you want to try it!
Nixiesearch has the following design limitations:
Get the sample MS MARCO dataset:
curl -L -O http://nixiesearch.ai/data/msmarco.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 162 100 162 0 0 3636 0 --:--:-- --:--:-- --:--:-- 3681
100 32085 100 32085 0 0 226k 0 --:--:-- --:--:-- --:--:-- 226k
Run the Nixiesearch docker container:
docker run -i -t -p 8080:8080 nixiesearch/nixiesearch:latest standalone
12:40:47.325 INFO ai.nixiesearch.main.Main$ - Staring Nixiesearch
12:40:47.460 INFO ai.nixiesearch.config.Config$ - No config file given, using defaults
12:40:47.466 INFO ai.nixiesearch.config.Config$ - Store: LocalStoreConfig(LocalStoreUrl(/))
12:40:47.557 INFO ai.nixiesearch.index.IndexRegistry$ - Index registry initialized: 0 indices, config: LocalStoreConfig(LocalStoreUrl(/))
12:40:48.253 INFO o.h.blaze.server.BlazeServerBuilder -
███╗ ██╗██╗██╗ ██╗██╗███████╗███████╗███████╗ █████╗ ██████╗ ██████╗██╗ ██╗
████╗ ██║██║╚██╗██╔╝██║██╔════╝██╔════╝██╔════╝██╔══██╗██╔══██╗██╔════╝██║ ██║
██╔██╗ ██║██║ ╚███╔╝ ██║█████╗ ███████╗█████╗ ███████║██████╔╝██║ ███████║
██║╚██╗██║██║ ██╔██╗ ██║██╔══╝ ╚════██║██╔══╝ ██╔══██║██╔══██╗██║ ██╔══██║
██║ ╚████║██║██╔╝ ██╗██║███████╗███████║███████╗██║ ██║██║ ██║╚██████╗██║ ██║
╚═╝ ╚═══╝╚═╝╚═╝ ╚═╝╚═╝╚══════╝╚══════╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝╚═╝ ╚═╝
12:40:48.267 INFO o.h.blaze.server.BlazeServerBuilder - http4s v1.0.0-M38 on blaze v1.0.0-M38 started at http://0.0.0.0:8080/
Build an index for a hybrid search:
curl -XPUT -d @msmarco.json http://localhost:8080/msmarco/_index
{"result":"created","took":8256}
Send the search query:
curl -XPOST -d '{"query": {"match": {"text":"new york"}},"fields": ["text"]}' \
http://localhost:8080/msmarco/_search
{
  "took": 13,
  "hits": [
    {
      "_id": "8035959",
      "text": "Climate & Weather Averages in New York, New York, USA.",
      "_score": 0.016666668
    },
    {
      "_id": "2384898",
      "text": "Consulate General of the Republic of Korea in New York.",
      "_score": 0.016393442
    },
    {
      "_id": "2241745",
      "text": "This is a list of the tallest buildings in New York City.",
      "_score": 0.016129032
    }
  ]
}
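The same search can be issued from code. The following is a minimal Python sketch, assuming the request and response shapes shown in the curl example above; the helper names `build_search_request` and `top_hits` are hypothetical and not part of Nixiesearch.

```python
import json
import urllib.request


def build_search_request(base_url: str, index: str, field: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl search call.

    The body shape is copied from the quickstart example; base_url and index
    are whatever your local container uses (assumed: localhost:8080, msmarco).
    """
    body = {"query": {"match": {field: text}}, "fields": [field]}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
    )


def top_hits(response_body: str) -> list[tuple[str, float]]:
    """Extract (_id, _score) pairs from a response shaped like the sample output."""
    return [(hit["_id"], hit["_score"]) for hit in json.loads(response_body)["hits"]]
```

With the container from the previous step running, passing the result of `build_search_request("http://localhost:8080", "msmarco", "text", "new york")` to `urllib.request.urlopen` and feeding the decoded response body to `top_hits` should yield the document IDs and scores from the response.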
You can also open http://localhost:8080/_ui in your web browser for a basic web UI.
For more details, see the complete Quickstart guide.
Nixiesearch is inspired by an Amazon search engine design described in the talk "E-Commerce search at scale on Apache Lucene":
Compared to traditional search engines like Elasticsearch/Solr:
Nixiesearch uses Reciprocal Rank Fusion (RRF) to combine text and neural search results.
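To illustrate the idea, here is a minimal Python sketch of reciprocal rank fusion in its commonly published form, where each document scores the sum of 1 / (k + rank) over every ranked list it appears in. This is not Nixiesearch's actual implementation: the function name `rrf_fuse`, the default `k = 60`, and the `first_rank` parameter are assumptions made for illustration. The sample scores in the quickstart response (roughly 1/60, 1/61, 1/62) look consistent with a single ranked list, `k = 60`, and zero-based ranks, but the source does not confirm this.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60, first_rank: int = 1) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    rankings:   one list of document IDs per retriever, best match first
    k:          smoothing constant (60 is the commonly cited default; an assumption here)
    first_rank: rank assigned to the top document (1 in the usual formulation)
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=first_rank):
            # Each list contributes 1 / (k + rank); documents missing from a list get nothing.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from a text retriever and a neural retriever.
text_hits = ["a", "b", "c"]
neural_hits = ["b", "c", "d"]
fused = rrf_fuse([text_hits, neural_hits])
```

Because only ranks are used, the text and neural scores never need to be normalized onto a common scale; documents that both retrievers rank well (`b` and `c` in this example) end up ahead of documents that appear in only one list.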
Nixiesearch is not a general-purpose search engine like Elasticsearch:
Nixiesearch is currently under active development, so please reach out to us via the contact form if you want to try it!
This project is released under the Apache 2.0 license, as specified in the License file.