lablup / raftify

Experimental High level Raft framework
https://docs.rs/raftify/latest/raftify
Apache License 2.0
36 stars 14 forks source link
distributed distributed-systems python raft raft-framework rust

Raftify

⚠️ WARNING: This library is in a very experimental stage. The API could be broken.

Raftify is a high-level implementation of Raft, developed with the goal of making it easy and straightforward to integrate the Raft algorithm.

It uses tikv/raft-rs and gRPC for the network layer and heed (LMDB wrapper) for the storage layer.

Quick guide

I strongly recommend to read the basic memstore example code to get how to use this library for starters, but here's a quick guide.

Define your own log entry

Define the data to be stored in LogEntry and how to serialize and deserialize it.

#[derive(Clone, Debug, Serialize, Deserialize)]
pub enum LogEntry {
    Insert { key: u64, value: String },
}

impl AbstractLogEntry for LogEntry {
    fn encode(&self) -> Result<Vec<u8>> {
        serialize(self).map_err(|e| e.into())
    }

    fn decode(bytes: &[u8]) -> Result<LogEntry> {
        let log_entry: LogEntry = deserialize(bytes)?;
        Ok(log_entry)
    }
}

Define your application Raft FSM

Essentially, the following three methods need to be implemented for the Store.

And also similarly to LogEntry, you need to implement encode and decode.

#[derive(Clone, Debug)]
pub struct HashStore(pub Arc<RwLock<HashMap<u64, String>>>);

impl HashStore {
    pub fn new() -> Self {
        Self(Arc::new(RwLock::new(HashMap::new())))
    }

    pub fn get(&self, id: u64) -> Option<String> {
        self.0.read().unwrap().get(&id).cloned()
    }
}

#[async_trait]
impl AbstractStateMachine for HashStore {
    async fn apply(&mut self, data: Vec<u8>) -> Result<Vec<u8>> {
        let log_entry: LogEntry = LogEntry::decode(&data)?;
        match log_entry {
            LogEntry::Insert { ref key, ref value } => {
                let mut db = self.0.write().unwrap();
                log::info!("Inserted: ({}, {})", key, value);
                db.insert(*key, value.clone());
            }
        };
        Ok(data)
    }

    async fn snapshot(&self) -> Result<Vec<u8>> {
        Ok(serialize(&self.0.read().unwrap().clone())?)
    }

    async fn restore(&mut self, snapshot: Vec<u8>) -> Result<()> {
        let new: HashMap<u64, String> = deserialize(&snapshot[..]).unwrap();
        let mut db = self.0.write().unwrap();
        let _ = std::mem::replace(&mut *db, new);
        Ok(())
    }

    fn encode(&self) -> Result<Vec<u8>> {
        serialize(&self.0.read().unwrap().clone()).map_err(|e| e.into())
    }

    fn decode(bytes: &[u8]) -> Result<Self> {
        let db: HashMap<u64, String> = deserialize(bytes)?;
        Ok(Self(Arc::new(RwLock::new(db))))
    }
}

Bootstrap a raft cluster

First bootstrap the cluster that contains the leader node.

let raft_addr = "127.0.0.1:60061".to_owned();
let node_id = 1;

let log_storage = HeedStorage::create(&storage_pth, &raft_config.clone(), logger.clone())
    .expect("Failed to create heed storage");

let raft = Raft::bootstrap(
    node_id,
    raft_addr,
    log_storage,
    store.clone(),
    raft_config,
    logger.clone(),
)?;

tokio::spawn(raft.clone().run());

// ...
tokio::try_join!(raft_handle)?;

Join follower nodes to the cluster

Then join the follower nodes.

If peer specifies the configuration of the initial members, the cluster will operate after all member nodes are bootstrapped.

let raft_addr = "127.0.0.1:60062".to_owned();
let peer_addr = "127.0.0.1:60061".to_owned();
let join_ticket = Raft::request_id(raft_addr, peer_addr).await;

let log_storage = HeedStorage::create(&storage_pth, &raft_config.clone(), logger.clone())
    .expect("Failed to create heed storage");

let raft = Raft::bootstrap(
    join_ticket.reserved_id,
    raft_addr,
    log_storage,
    store.clone(),
    raft_config,
    logger.clone(),
)?;

let raft_handle = tokio::spawn(raft.clone().run());
raft.join_cluster(vec![join_ticket]).await?;

// ...
tokio::try_join!(raft_handle)?;

Manipulate FSM by RaftServiceClient

If you want to operate the FSM remotely, you can use RaftServiceClient.

let mut leader_client = create_client(&"127.0.0.1:60061").await.unwrap();

leader_client
    .propose(raft_service::ProposeArgs {
        msg: LogEntry::Insert {
            key: 1,
            value: "test".to_string(),
        }
        .encode()
        .unwrap(),
    })
    .await
    .unwrap();

Manipulate FSM by RaftNode

If you want to operate FSM locally, use the RaftNode type of the Raft object.

raft.propose(LogEntry::Insert {
    key: 123,
    value: "test".to_string(),
}.encode().unwrap()).await;

Debugging

You can use a collection of CLI commands that let you inspect the data persisted in stable storage and the status of Raft Servers.

❯ raftify-cli describe logs ./logs/node-1
┌───────┬────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬──────┐
│ index ┆    type    ┆                                                             data                                                             ┆ term │
╞═══════╪════════════╪══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╪══════╡
│   2   ┆ ConfChange ┆ ConfChangeV2 { transition: 0, changes: [ConfChangeSingle { change_type: AddNode, node_id: 2 }], context: [127.0.0.1:60062] } ┆   1  │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│   3   ┆ ConfChange ┆ ConfChangeV2 { transition: 0, changes: [ConfChangeSingle { change_type: AddNode, node_id: 3 }], context: [127.0.0.1:60063] } ┆   1  │
└───────┴────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴──────┘
❯ raftify-cli describe metadata ./logs/node-1
┌───────────┬─────────────────────────────────────┬──────────────────────────┐
│           ┆ Field                               ┆ Value                    │
╞═══════════╪═════════════════════════════════════╪══════════════════════════╡
│ HardState ┆ term                                ┆ 1                        │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ vote                                ┆ 1                        │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ commit                              ┆ 3                        │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ConfState ┆ voters                              ┆ {1, 2, 3}                │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ learners                            ┆ {}                       │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ voters_outgoing                     ┆ {}                       │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ learners_next                       ┆ {}                       │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ auto_leave                          ┆ false                    │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Snapshot  ┆ data                                ┆ [0, 0, 0, 0, 0, 0, 0, 0] │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ metadata.index                      ┆ 3                        │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ metadata.term                       ┆ 1                        │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ metadata.conf_state.voters          ┆ {1, 2, 3}                │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ metadata.conf_state.learners        ┆ {}                       │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ metadata.conf_state.voters_outgoing ┆ {}                       │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ metadata.conf_state.learners_next   ┆ {}                       │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│           ┆ metadata.conf_state.auto_leave      ┆ false                    │
├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ LastIndex ┆ last index                          ┆ 3                        │
└───────────┴─────────────────────────────────────┴──────────────────────────┘

Bootstrapping from WAL

If there are previous logs remaining in the log directory, the raft node will automatically apply them after the node is bootstrapped. If you intend to bootstrap the cluster from the scratch, please remove the previous log directory. To ignore the previous logs and bootstrap the cluster from a snapshot, use the Config.bootstrap_from_snapshot option.

Support for other languages

Raftify provides bindings for the following languages.

Building from Source

If you want to build Raftify from the source code or set up a development environment, please refer to the DEVELOPMENT.md.

References

Raftify was inspired by a wide variety of previous Raft implementations.

Great thanks to all the relevant developers.