mosuka / phalanx

Phalanx is a cloud-native distributed search engine that provides endpoints through gRPC and traditional RESTful API.
Apache License 2.0

Phalanx

Phalanx is a cloud-native distributed search engine written in Go and built on top of Bluge. It provides endpoints through gRPC and a traditional RESTful API.
Phalanx forms clusters with hashicorp/memberlist and manages index metadata on etcd, so it is easy to bring up a fault-tolerant cluster.
Operational metrics can also be output in the Prometheus exposition format, so monitoring with Prometheus can start immediately.
Phalanx uses object storage for the storage layer and is itself responsible only for the computation layer, such as indexing and retrieval. Scaling is therefore easy: you simply add new nodes to the cluster.
Phalanx is currently an alpha version and supports only Amazon S3 and MinIO as the storage layer; support for Google Cloud Storage and Azure Blob Storage is planned.
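
For readers unfamiliar with the Prometheus exposition format mentioned above, the following is a minimal sketch of what a single counter looks like in that text format. It uses only the Go standard library; the `Counter` type and the metric name `example_requests_total` are hypothetical and are not taken from Phalanx, which may well produce its metrics through a client library instead.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Counter is a hypothetical, minimal metric type used only for illustration.
// It is not part of Phalanx.
type Counter struct {
	Name   string
	Help   string
	Labels map[string]string
	Value  float64
}

// labelEscaper escapes backslash, double quote, and newline in label values,
// as the text exposition format requires.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// Exposition renders the counter as a HELP line, a TYPE line, and one sample.
// Escaping of the HELP text is omitted to keep the sketch short.
func (c Counter) Exposition() string {
	var b strings.Builder
	fmt.Fprintf(&b, "# HELP %s %s\n", c.Name, c.Help)
	fmt.Fprintf(&b, "# TYPE %s counter\n", c.Name)

	// Sort label names so the output is deterministic.
	keys := make([]string, 0, len(c.Labels))
	for k := range c.Labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+`="`+labelEscaper.Replace(c.Labels[k])+`"`)
	}

	b.WriteString(c.Name)
	if len(pairs) > 0 {
		b.WriteString("{" + strings.Join(pairs, ",") + "}")
	}
	fmt.Fprintf(&b, " %g\n", c.Value)
	return b.String()
}

func main() {
	c := Counter{
		Name:   "example_requests_total", // hypothetical metric name
		Help:   "Total number of requests handled.",
		Labels: map[string]string{"method": "search"},
		Value:  3,
	}
	fmt.Print(c.Exposition())
}
```

A Prometheus server scrapes text in this shape from an HTTP endpoint on each node, so every node can be monitored independently.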

Architecture

Phalanx is a masterless distributed search engine that separates the computation layer for searching and indexing from the storage layer for persisting the index. The storage layer is designed to use object storage on public clouds such as Amazon S3, Google Cloud Storage, and Azure Blob Storage.

Phalanx makes it easy to bring up a distributed search engine cluster. When a Phalanx cluster runs short of resources, you simply add nodes; nodes that are no longer needed can simply be shut down. Indexes live in object storage, so there is no need to worry about index placement, and no complex operations are required. Clusters are flexible and scalable.

Phalanx stores index metadata in etcd. The metadata records each index and the paths of the shards under that index. Nodes process the distributed index based on the metadata stored in etcd.
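
To make the idea of index metadata more concrete, here is a minimal sketch of a record that names an index and lists where its shards live. The `IndexMetadata` and `ShardMetadata` types, the JSON field names, and the `s3://example-bucket/...` paths are all assumptions made for illustration; they are not Phalanx's actual schema or key layout in etcd.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ShardMetadata is a hypothetical description of one shard.
type ShardMetadata struct {
	Name string `json:"name"`
	URI  string `json:"uri"` // object-storage location of the shard (assumed form)
}

// IndexMetadata is a hypothetical record for one index and its shards.
type IndexMetadata struct {
	Name   string          `json:"name"`
	Shards []ShardMetadata `json:"shards"`
}

// ParseIndexMetadata decodes a metadata value such as one read from a
// key-value store.
func ParseIndexMetadata(raw []byte) (IndexMetadata, error) {
	var m IndexMetadata
	err := json.Unmarshal(raw, &m)
	return m, err
}

// ShardURIs returns the storage path of every shard in the index.
func (m IndexMetadata) ShardURIs() []string {
	uris := make([]string, 0, len(m.Shards))
	for _, s := range m.Shards {
		uris = append(uris, s.URI)
	}
	return uris
}

func main() {
	// An example value; the bucket and paths are made up.
	raw := []byte(`{
		"name": "example",
		"shards": [
			{"name": "shard-0", "uri": "s3://example-bucket/example/shard-0"},
			{"name": "shard-1", "uri": "s3://example-bucket/example/shard-1"}
		]
	}`)

	meta, err := ParseIndexMetadata(raw)
	if err != nil {
		panic(err)
	}
	for _, uri := range meta.ShardURIs() {
		fmt.Println(meta.Name, "->", uri)
	}
}
```

Because a record like this only points at object storage, a node that reads it can open any shard without that shard having been placed on the node beforehand.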

Phalanx also uses etcd as a distributed lock manager to ensure that updates to a single shard are not made on multiple nodes at the same time.
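
The guarantee described here, at most one writer per shard at any time, can be sketched with an in-process stand-in. The `ShardLocker` type below is hypothetical and keeps its state in local memory; it is not Phalanx's code and it does not talk to etcd. A real distributed lock would additionally need lease-based expiry so that a lock held by a crashed node is eventually released.

```go
package main

import (
	"fmt"
	"sync"
)

// ShardLocker is a hypothetical in-memory stand-in for a distributed lock
// manager. It maps each shard to the node currently allowed to update it.
type ShardLocker struct {
	mu    sync.Mutex
	owner map[string]string // shard -> node holding the lock
}

// NewShardLocker returns an empty lock table.
func NewShardLocker() *ShardLocker {
	return &ShardLocker{owner: make(map[string]string)}
}

// TryLock grants the lock on shard to node only if nobody holds it.
func (l *ShardLocker) TryLock(shard, node string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, held := l.owner[shard]; held {
		return false
	}
	l.owner[shard] = node
	return true
}

// Unlock releases the lock on shard, but only if node is the current holder.
func (l *ShardLocker) Unlock(shard, node string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.owner[shard] != node {
		return false
	}
	delete(l.owner, shard)
	return true
}

func main() {
	locks := NewShardLocker()

	fmt.Println(locks.TryLock("shard-0", "node-a")) // node-a acquires the lock
	fmt.Println(locks.TryLock("shard-0", "node-b")) // node-b is refused
	fmt.Println(locks.TryLock("shard-1", "node-b")) // a different shard is free

	locks.Unlock("shard-0", "node-a")
	fmt.Println(locks.TryLock("shard-0", "node-b")) // now node-b can proceed
}
```

Locks are taken per shard rather than per index, so updates to different shards of the same index can proceed on different nodes in parallel.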

[Figure: Phalanx architecture]

Build

Build Phalanx as follows:

% git clone https://github.com/mosuka/phalanx.git
% cd phalanx
% make build

When the build succeeds, the binary file appears in ./bin:

% ls ./bin
phalanx
