mosuka / blast

Blast is a full text search and indexing server, written in Go, built on top of Bleve.
Apache License 2.0

Consensus Protocol Implement method? #118

Closed Peipeilvcm closed 4 years ago

Peipeilvcm commented 4 years ago

I know from the README that the cluster is built on the Raft consensus algorithm.

But when I ran in cluster mode and killed the leader node, re-election did not happen.

Does Blast support leader re-election?

Also, for consensus, write operations (indexing, PUT) should happen only on the leader node. After the leader was killed, I sent an HTTP indexing request to a follower node and it still worked, so I am a little confused. Can write operations work on followers?

If write operations can work on followers, then when different writes reach different nodes at the same time, consensus and ordering may not be guaranteed.

mosuka commented 4 years ago

Hi @Peipeilvcm, thank you for your report. Unfortunately, I could not reproduce it in my environment. I tried the following commands. Can you share yours?

Start indexer1:

$ ./bin/blast indexer start \
    --grpc-address=:5000 \
    --grpc-gateway-address=:6000 \
    --http-address=:8000 \
    --node-id=indexer1 \
    --node-address=:2000 \
    --data-dir=/tmp/blast/indexer1 \
    --raft-storage-type=boltdb \
    --index-mapping-file=./example/wiki_index_mapping.json \
    --index-type=upside_down \
    --index-storage-type=boltdb

Start indexer2:

$ ./bin/blast indexer start \
    --peer-grpc-address=:5000 \
    --grpc-address=:5010 \
    --grpc-gateway-address=:6010 \
    --http-address=:8010 \
    --node-id=indexer2 \
    --node-address=:2010 \
    --data-dir=/tmp/blast/indexer2 \
    --raft-storage-type=boltdb

Start indexer3:

$ ./bin/blast indexer start \
    --peer-grpc-address=:5000 \
    --grpc-address=:5020 \
    --grpc-gateway-address=:6020 \
    --http-address=:8020 \
    --node-id=indexer3 \
    --node-address=:2020 \
    --data-dir=/tmp/blast/indexer3 \
    --raft-storage-type=boltdb

You can see the cluster info as follows:

$ ./bin/blast indexer cluster info --grpc-address=:5000 | jq .
{
  "cluster": {
    "nodes": {
      "indexer1": {
        "id": "indexer1",
        "bind_address": ":2000",
        "state": 3,
        "metadata": {
          "grpc_address": ":5000",
          "grpc_gateway_address": ":6000",
          "http_address": ":8000"
        }
      },
      "indexer2": {
        "id": "indexer2",
        "bind_address": ":2010",
        "state": 1,
        "metadata": {
          "grpc_address": ":5010",
          "grpc_gateway_address": ":6010",
          "http_address": ":8010"
        }
      },
      "indexer3": {
        "id": "indexer3",
        "bind_address": ":2020",
        "state": 1,
        "metadata": {
          "grpc_address": ":5020",
          "grpc_gateway_address": ":6020",
          "http_address": ":8020"
        }
      }
    }
  }
}

Stop indexer1, the leader, and check the cluster information using the following command:

$ ./bin/blast indexer cluster info --grpc-address=:5010 | jq .
{
  "cluster": {
    "nodes": {
      "indexer1": {
        "state": 4,
        "metadata": {}
      },
      "indexer2": {
        "id": "indexer2",
        "bind_address": ":2010",
        "state": 3,
        "metadata": {
          "grpc_address": ":5010",
          "grpc_gateway_address": ":6010",
          "http_address": ":8010"
        }
      },
      "indexer3": {
        "id": "indexer3",
        "bind_address": ":2020",
        "state": 1,
        "metadata": {
          "grpc_address": ":5020",
          "grpc_gateway_address": ":6020",
          "http_address": ":8020"
        }
      }
    }
  }
}

You can see that one of the followers has become the leader ("state": 3) and indexer1 is gone ("state": 4).

Restart indexer1 as a follower:

$ ./bin/blast indexer start \
    --peer-grpc-address=:5010 \
    --grpc-address=:5000 \
    --grpc-gateway-address=:6000 \
    --http-address=:8000 \
    --node-id=indexer1 \
    --node-address=:2000 \
    --data-dir=/tmp/blast/indexer1 \
    --raft-storage-type=boltdb

Then, check the cluster information using the following command:

$ ./bin/blast indexer cluster info --grpc-address=:5010 | jq .
{
  "cluster": {
    "nodes": {
      "indexer1": {
        "id": "indexer1",
        "bind_address": ":2000",
        "state": 1,
        "metadata": {
          "grpc_address": ":5000",
          "grpc_gateway_address": ":6000",
          "http_address": ":8000"
        }
      },
      "indexer2": {
        "id": "indexer2",
        "bind_address": ":2010",
        "state": 3,
        "metadata": {
          "grpc_address": ":5010",
          "grpc_gateway_address": ":6010",
          "http_address": ":8010"
        }
      },
      "indexer3": {
        "id": "indexer3",
        "bind_address": ":2020",
        "state": 1,
        "metadata": {
          "grpc_address": ":5020",
          "grpc_gateway_address": ":6020",
          "http_address": ":8020"
        }
      }
    }
  }
}

You can see indexer1 has become a follower ("state": 1).
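
The state values in the outputs above can also be read programmatically. The following is a minimal Go sketch, assuming only the mapping mentioned in this thread (1 = follower, 3 = leader, 4 = gone); any other value is treated as unknown. `clusterInfo` and `findLeader` are hypothetical names for illustration and are not part of Blast.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterInfo mirrors only the fields of the `cluster info` output
// shown in this thread; it is not Blast's own type.
type clusterInfo struct {
	Cluster struct {
		Nodes map[string]struct {
			ID    string `json:"id"`
			State int    `json:"state"`
		} `json:"nodes"`
	} `json:"cluster"`
}

// State values as described in this thread. Other values are not
// covered here and are simply ignored.
const (
	stateFollower = 1
	stateLeader   = 3
	stateGone     = 4
)

// findLeader (hypothetical helper) returns the map key of the node
// that reports the leader state, and false if no node does.
func findLeader(raw []byte) (string, bool, error) {
	var info clusterInfo
	if err := json.Unmarshal(raw, &info); err != nil {
		return "", false, err
	}
	for name, node := range info.Cluster.Nodes {
		if node.State == stateLeader {
			return name, true, nil
		}
	}
	return "", false, nil
}

func main() {
	// Trimmed copy of the output taken after indexer1 was stopped.
	raw := []byte(`{"cluster":{"nodes":{
		"indexer1":{"state":4,"metadata":{}},
		"indexer2":{"id":"indexer2","state":3},
		"indexer3":{"id":"indexer3","state":1}}}}`)

	leader, ok, err := findLeader(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(leader, ok)
}
```

Piping the real `cluster info` output into such a helper would make it easy to poll until a new leader appears after the old one is stopped.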

Peipeilvcm commented 4 years ago

@mosuka

I started the cluster as follows:

sudo docker run -d --name blast-indexer1 --net=host -v $PWD/example:/opt/blast/example \
mosuka/blast:latest blast indexer start \
--grpc-address=:5001 \
--grpc-gateway-address=:6000 \
--http-address=:8000 \
--node-id=blast-indexer1 \
--node-address=:2000 \
--data-dir=/tmp/blast/indexer1 \
--raft-storage-type=boltdb \
--index-mapping-file=/opt/blast/example/A_mapping_for_logging.json \
--index-type=scorch \
--index-storage-type=scorch

sudo docker run -d --name blast-indexer2 --net=host \
mosuka/blast:latest blast indexer start \
--peer-grpc-address=:5001 \
--grpc-address=:5010 \
--grpc-gateway-address=:6010 \
--http-address=:8010 \
--node-id=indexer2 \
--node-address=:2010 \
--data-dir=/tmp/blast/indexer2 \
--raft-storage-type=boltdb

sudo docker run -d --name blast-indexer3 --net=host \
mosuka/blast:latest blast indexer start \
--peer-grpc-address=:5001 \
--grpc-address=:5020 \
--grpc-gateway-address=:6020 \
--http-address=:8020 \
--node-id=indexer3 \
--node-address=:2020 \
--data-dir=/tmp/blast/indexer3 \
--raft-storage-type=boltdb

Kill indexer1:

sudo docker stop blast-indexer1
sudo docker rm blast-indexer1

Index a document on indexer2 with curl:

curl -X PUT 'http://127.0.0.1:6010/v1/documents/enwiki_1' -H 'Content-Type: application/json' --data-binary '
{
  "fields": {
    "title_en": "Search engine (computing)",
    "text_en": "A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits. Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload. The most public, visible form of a search engine is a Web search engine which searches for information on the World Wide Web.",
    "timestamp": "2018-07-04T05:41:00Z",
    "_type": "enwiki"
  }
}
'

The document can be retrieved from both indexer2 and indexer3:

$ curl -X GET 'http://127.0.0.1:6020/v1/documents/enwiki_1' -H 'Content-Type: application/json' | jq .
{
  "fields": {
    "_type": "enwiki",
    "text_en": "A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits. Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload. The most public, visible form of a search engine is a Web search engine which searches for information on the World Wide Web.",
    "timestamp": "2018-07-04T05:41:00Z",
    "title_en": "Search engine (computing)"
  }
}

Index a document on indexer3 with curl:

curl -X PUT 'http://127.0.0.1:6020/v1/documents/enwiki_2' -H 'Content-Type: application/json' --data-binary '
{
  "fields": {
    "title_en": "Search engine (computing)",
    "text_en": "A saasascafa",
    "timestamp": "2018-07-04T05:41:00Z",
    "_type": "enwiki"
  }
}
'

It works well; both indexer2 and indexer3 accept writes.

Thanks.

Peipeilvcm commented 4 years ago

@mosuka no reply?

mosuka commented 4 years ago

Hi @Peipeilvcm, sorry for the late reply. Data replication completed after I restarted indexer1 as follows:

sudo docker run -d --name blast-indexer1 --net=host -v $PWD/example:/opt/blast/example \
mosuka/blast:latest blast indexer start \
--peer-grpc-address=:5010 \
--grpc-address=:5001 \
--grpc-gateway-address=:6000 \
--http-address=:8000 \
--node-id=blast-indexer1 \
--node-address=:2000 \
--data-dir=/tmp/blast/indexer1 \
--raft-storage-type=boltdb

It may take some time to elect a leader. Can you check it?

mosuka commented 4 years ago

@Peipeilvcm

I added a docker-compose.yml. If you use Docker to build a cluster, please refer to the following as well: https://github.com/mosuka/blast#running-cluster-on-docker-compose

Peipeilvcm commented 4 years ago

Thanks for your reply. One question remains:

@mosuka, can indexing happen on follower nodes? In my steps above, I could index the document enwiki_2 (and possibly enwiki_1) on a follower node with curl.

mosuka commented 4 years ago

In Blast, update requests received by followers are forwarded to the leader. The followers do not update the data directly.
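
The forwarding behaviour described above can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not Blast's implementation: the `node` type, `newNode`, and `put` are hypothetical, and the leader here copies each write straight to its followers, whereas real Raft only commits an entry after a quorum has acknowledged it. The point is only that every write passes through one node, so all nodes see the same order.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a toy stand-in for an indexer; it is not Blast's real type.
type node struct {
	id        string
	isLeader  bool
	leader    *node             // known leader, set on followers
	followers []*node           // set on the leader
	log       []string          // document IDs in the order they were applied
	docs      map[string]string // document ID -> body
}

func newNode(id string) *node {
	return &node{id: id, docs: make(map[string]string)}
}

// apply records one write locally.
func (n *node) apply(docID, body string) {
	n.log = append(n.log, docID)
	n.docs[docID] = body
}

// put accepts a write on any node. A follower never applies the write
// itself; it forwards the request to the leader, which orders it and
// pushes it to every follower (simplified: no quorum, no failures).
func (n *node) put(docID, body string) error {
	if !n.isLeader {
		if n.leader == nil {
			return errors.New("no leader known")
		}
		return n.leader.put(docID, body)
	}
	n.apply(docID, body)
	for _, f := range n.followers {
		f.apply(docID, body)
	}
	return nil
}

func main() {
	leader := newNode("indexer2")
	follower := newNode("indexer3")
	leader.isLeader = true
	leader.followers = []*node{follower}
	follower.leader = leader

	// One write arrives at the follower, one at the leader.
	if err := follower.put("enwiki_1", "first"); err != nil {
		panic(err)
	}
	if err := leader.put("enwiki_2", "second"); err != nil {
		panic(err)
	}

	fmt.Println(leader.log)
	fmt.Println(follower.log)
}
```

In this model a PUT sent to a follower succeeds from the client's point of view, which matches what was observed with curl, yet the ordering is still decided by a single node.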