typesense / typesense

Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences
https://typesense.org
GNU General Public License v3.0

Node removing all peers when one peer is down #2076

Open MattySimpLush opened 1 week ago

MattySimpLush commented 1 week ago

Description

In a Kubernetes-deployed cluster, when one node goes down, the remaining nodes occasionally remove all peers and terminate abruptly. This behaviour disrupts the cluster’s stability and availability.

Steps to reproduce

  1. I have deployed it using these 3 ways and all result in the same behaviour:
  2. Run a 3-replica cluster (also tested and reproduced on a 5-node cluster).
  3. In production I have seen the error when a Kubernetes node gets descheduled, but it can be recreated by continuously killing one of the pods, e.g. while true; do kubectl delete pods -n typesense typesense-0; done
  4. Leave this loop running and the other nodes in the cluster will start logging errors and restarting.
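
A bounded variant of the kill loop in step 3 can make the reproduction easier to control. The following is only a sketch: the function name, the attempt count and the delay are assumptions chosen for illustration, while the kubectl command itself (pod typesense-0, namespace typesense) is taken from the step above.

```python
import subprocess
import time


def kill_pod_repeatedly(pod="typesense-0", namespace="typesense",
                        attempts=20, delay_s=5.0, run=subprocess.run):
    """Delete the same pod `attempts` times, pausing between deletions.

    Returns the list of kubectl exit codes, one per attempt.
    `attempts` and `delay_s` are illustrative defaults, not values
    from the original report (which used an unbounded shell loop).
    """
    cmd = ["kubectl", "delete", "pods", "-n", namespace, pod]
    exit_codes = []
    for attempt in range(attempts):
        # check=False: a failed delete (for example, the pod has not been
        # recreated yet) should not stop the loop.
        result = run(cmd, check=False)
        exit_codes.append(result.returncode)
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return exit_codes
```

Calling kill_pod_repeatedly() with the defaults deletes typesense-0 twenty times, five seconds apart; the `run` parameter exists only so the loop can be exercised without a cluster.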

Expected Behavior

When one node in a 3-replica cluster is lost:

Actual Behavior

Example Logs

  |   | 2024-11-21 15:02:00.430 | logtag="F" message="I20241121 15:02:00.430347   163 node.cpp:3185] node default_group:10.193.6.89:8107:8108 change_peers from 10.193.5.62:8107:8108,10.193.6.89:8107:8108,10.193.2.110:8107:8108 to , begin removing." time="2024-11-21T15:02:00.430472249Z" |  
  |   | 2024-11-21 15:02:00.430 | logtag="F" message="F20241121 15:02:00.430644   177 replicator.cpp:612] Check failed: !entry->peers->empty() log_index=7" time="2024-11-21T15:02:00.430819001Z" |  
  |   | 2024-11-21 15:02:00.430 | logtag="F" message="*** Check failure stack trace: ***" time="2024-11-21T15:02:00.430835716Z" |  
  |   | 2024-11-21 15:02:00.431 | logtag="F" message="F20241121 15:02:00.431208   181 replicator.cpp:612] Check failed: !entry->peers->empty() log_index=7" time="2024-11-21T15:02:00.431621159Z" |  
  |   | 2024-11-21 15:02:00.431 | logtag="F" message="*** Check failure stack trace: ***" time="2024-11-21T15:02:00.431642018Z" |  
  |   | 2024-11-21 15:02:00.933 | logtag="F" message="W20241121 15:02:00.933612   182 node.cpp:795] node default_group:10.193.6.89:8107:8108 term 7 steps down when alive nodes don't satisfy quorum dead_nodes:  conf: " time="2024-11-21T15:02:00.933779496Z" |  
  |   | 2024-11-21 15:02:00.933 | logtag="F" message="I20241121 15:02:00.933805   182 node.cpp:3298] node default_group:10.193.6.89:8107:8108 reset ConfigurationCtx, new_peers: , old_peers: 10.193.5.62:8107:8108,10.193.6.89:8107:8108,10.193.2.110:8107:8108" time="2024-11-21T15:02:00.933935615Z" |  
  |   | 2024-11-21 15:02:00.934 | logtag="F" message="I20241121 15:02:00.933965   182 node.cpp:3383] node default_group:10.193.6.89:8107:8108 majority readonly change from disable to  enable" time="2024-11-21T15:02:00.934024169Z" |  
  |   | 2024-11-21 15:02:00.934 | logtag="F" message="I20241121 15:02:00.933849   186 raft_server.h:281] Node stepped down : Majority of the group dies" time="2024-11-21T15:02:00.934086156Z" |  
  |   | 2024-11-21 15:02:01.689 | logtag="F" message="E20241121 15:02:01.689065   181 backward.hpp:4200] Stack trace (most recent call last) in thread 181:" time="2024-11-21T15:02:01.689243282Z" |  
  |   | 2024-11-21 15:02:01.689 | logtag="F" message="E20241121 15:02:01.689365   181 backward.hpp:4200] #10   Object "/opt/typesense-server", at 0x5907ac831640, in bthread_make_fcontext" time="2024-11-21T15:02:01.689447457Z" |  
  |   | 2024-11-21 15:02:01.689 | logtag="F" message="E20241121 15:02:01.689464   181 backward.hpp:4200] #9    Object "/opt/typesense-server", at 0x5907ac852af8, in bthread::TaskGroup::task_runner(long)" time="2024-11-21T15:02:01.6895421Z" |  
  |   | 2024-11-21 15:02:01.689 | logtag="F" message="E20241121 15:02:01.689618   181 backward.hpp:4200] #8    Object "/opt/typesense-server", at 0x5907ac5193ee, in braft::LogManager::run_on_new_log(void*)" time="2024-11-21T15:02:01.68968582Z" |  
  |   | 2024-11-21 15:02:01.689 | logtag="F" message="E20241121 15:02:01.689703   181 backward.hpp:4200] #7    Object "/opt/typesense-server", at 0x5907ac57a9df, in braft::Replicator::_continue_sending(void*, int)" time="2024-11-21T15:02:01.689788712Z" |  
  |   | 2024-11-21 15:02:01.689 | logtag="F" message="E20241121 15:02:01.689803   181 backward.hpp:4200] #6    Object "/opt/typesense-server", at 0x5907ac57a07f, in braft::Replicator::_send_entries()" time="2024-11-21T15:02:01.689888161Z" |  
  |   | 2024-11-21 15:02:01.689 | logtag="F" message="E20241121 15:02:01.689882   181 backward.hpp:4200] #5    Object "/opt/typesense-server", at 0x5907ac5798b0, in braft::Replicator::_prepare_entry(int, braft::EntryMeta*, butil::IOBuf*)" time="2024-11-21T15:02:01.689931202Z" |  
  |   | 2024-11-21 15:02:01.690 | logtag="F" message="E20241121 15:02:01.690047   181 backward.hpp:4200] #4    Object "/opt/typesense-server", at 0x5907acacbe69, in google::LogMessageFatal::~LogMessageFatal()" time="2024-11-21T15:02:01.690115239Z" |  
  |   | 2024-11-21 15:02:01.690 | logtag="F" message="E20241121 15:02:01.690132   181 backward.hpp:4200] #3    Object "/opt/typesense-server", at 0x5907acac839f, in google::LogMessage::Flush()" time="2024-11-21T15:02:01.690199626Z" |  
  |   | 2024-11-21 15:02:01.690 | logtag="F" message="E20241121 15:02:01.690220   181 backward.hpp:4200] #2    Object "/opt/typesense-server", at 0x5907acac8b6d, in google::LogMessage::SendToLog()" time="2024-11-21T15:02:01.690269685Z" |  
  |   | 2024-11-21 15:02:01.690 | logtag="F" message="E20241121 15:02:01.690395   181 backward.hpp:4200] #1    Object "/opt/typesense-server", at 0x5907acac8c11, in google::LogMessage::Fail()" time="2024-11-21T15:02:01.690469922Z" |  
  |   | 2024-11-21 15:02:01.690 | logtag="F" message="E20241121 15:02:01.690582   181 backward.hpp:4200] #0    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7e06afc5b898, in abort" time="2024-11-21T15:02:01.690664615Z" |  
  |   | 2024-11-21 15:02:01.690 | logtag="F" message="Segmentation fault (Signal sent by the kernel [(nil)])" time="2024-11-21T15:02:01.690861279Z" |  
  |   | 2024-11-21 15:02:01.703 | logtag="F" message="E20241121 15:02:01.703531   177 backward.hpp:4200] Stack trace (most recent call last) in thread 177:" time="2024-11-21T15:02:01.703904234Z" |  
  |   | 2024-11-21 15:02:01.703 | logtag="F" message="E20241121 15:02:01.703555   177 backward.hpp:4200] #12   Object "/opt/typesense-server", at 0x5907ac831640, in bthread_make_fcontext" time="2024-11-21T15:02:01.703959717Z" |  
  |   | 2024-11-21 15:02:01.703 | logtag="F" message="E20241121 15:02:01.703570   177 backward.hpp:4200] #11   Object "/opt/typesense-server", at 0x5907ac852af8, in bthread::TaskGroup::task_runner(long)" time="2024-11-21T15:02:01.703988067Z" |  
  |   | 2024-11-21 15:02:01.703 | logtag="F" message="E20241121 15:02:01.703589   177 backward.hpp:4200] #10   Object "/opt/typesense-server", at 0x5907ac5193ee, in braft::LogManager::run_on_new_log(void*)" time="2024-11-21T15:02:01.703993966Z" |  
  |   | 2024-11-21 15:02:01.703 | logtag="F" message="E20241121 15:02:01.703611   177 backward.hpp:4200] #9    Object "/opt/typesense-server", at 0x5907ac57a9df, in braft::Replicator::_continue_sending(void*, int)" time="2024-11-21T15:02:01.703999213Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="E20241121 15:02:01.703632   177 backward.hpp:4200] #8    Object "/opt/typesense-server", at 0x5907ac57a07f, in braft::Replicator::_send_entries()" time="2024-11-21T15:02:01.704004629Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="E20241121 15:02:01.703662   177 backward.hpp:4200] #7    Object "/opt/typesense-server", at 0x5907ac5798b0, in braft::Replicator::_prepare_entry(int, braft::EntryMeta*, butil::IOBuf*)" time="2024-11-21T15:02:01.704010572Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="E20241121 15:02:01.703709   177 backward.hpp:4200] #6    Object "/opt/typesense-server", at 0x5907acacbe69, in google::LogMessageFatal::~LogMessageFatal()" time="2024-11-21T15:02:01.70401655Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="E20241121 15:02:01.703744   177 backward.hpp:4200] #5    Object "/opt/typesense-server", at 0x5907acac839f, in google::LogMessage::Flush()" time="2024-11-21T15:02:01.704051712Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="E20241121 15:02:01.703759   177 backward.hpp:4200] #4    Object "/opt/typesense-server", at 0x5907acac8b6d, in google::LogMessage::SendToLog()" time="2024-11-21T15:02:01.704058494Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="E20241121 15:02:01.703768   177 backward.hpp:4200] #3    Object "/opt/typesense-server", at 0x5907acac8c11, in google::LogMessage::Fail()" time="2024-11-21T15:02:01.704064394Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="E20241121 15:02:01.703776   177 backward.hpp:4200] #2    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7e06afc5b7f2, in abort" time="2024-11-21T15:02:01.70406976Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="E20241121 15:02:01.703786   177 backward.hpp:4200] #1    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7e06afc75475, in raise" time="2024-11-21T15:02:01.704075297Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="E20241121 15:02:01.703795   177 backward.hpp:4200] #0    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7e06afcc99fc, in pthread_kill" time="2024-11-21T15:02:01.704080763Z" |  
  |   | 2024-11-21 15:02:01.704 | logtag="F" message="Aborted (Signal sent by tkill() 1 10000)" time="2024-11-21T15:02:01.704086711Z" |  
  |   | 2024-11-21 15:02:02.206 | logtag="F" message="W20241121 15:02:02.206269   186 socket.cpp:1340] Fail to wait EPOLLOUT of fd=32: Connection timed out [110]" time="2024-11-21T15:02:02.206432841Z" |  
  |   | 2024-11-21 15:02:02.314 | logtag="F" message="I20241121 15:02:02.314613   164 batched_indexer.cpp:428] Running GC for aborted requests, req map size: 0" time="2024-11-21T15:02:02.314795552Z" |  
  |   | 2024-11-21 15:02:02.856 | logtag="F" message="I20241121 15:02:02.856396   177 core_api.cpp:67] No in-flight search queries were found." time="2024-11-21T15:02:02.856591934Z" |  
  |   | 2024-11-21 15:02:02.856 | logtag="F" message="E20241121 15:02:02.856449   177 typesense_server.cpp:135] Typesense 27.1 is terminating abruptly." time="2024-11-21T15:02:02.856650965Z"
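
The first log line above shows change_peers going from three peers to an empty set, immediately before the `!entry->peers->empty()` check fails. A small scanner can flag that pattern in collected logs. This is a sketch only: the regular expression is inferred from the single change_peers line in this issue, and the function names are hypothetical.

```python
import re

# Pattern inferred from the one change_peers line quoted in this issue;
# other braft versions may format the message differently.
CHANGE_PEERS = re.compile(
    r"node (?P<node>\S+) change_peers from (?P<old>\S*) to (?P<new>\S*?), begin removing"
)


def split_peers(text):
    """Split a comma-separated peer list, dropping empty entries."""
    return [peer for peer in text.split(",") if peer]


def find_empty_peer_changes(lines):
    """Return (node, old_peers) for every change_peers whose target set is empty."""
    hits = []
    for line in lines:
        match = CHANGE_PEERS.search(line)
        if match and not split_peers(match.group("new")):
            hits.append((match.group("node"), split_peers(match.group("old"))))
    return hits
```

Feeding the log lines from a pod into find_empty_peer_changes returns one entry per occurrence, which makes it easier to correlate the empty peer set with the moment a pod was deleted.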

Metadata

Typesense Version: Observed on 0.25.2 and 27.1. These are the only two versions I have checked.

OS: Running on GKE cos-113-18244-151-88

b0g3r commented 1 week ago

It is not clear from the issue how you provision the nodes list to the nodes. Could you add more details about it?

MattySimpLush commented 1 week ago

Hi @b0g3r

Yes, it is deployed as a ConfigMap. For the 3-node setup it looked like this:

apiVersion: v1
data:
  nodes: |
    typesense-0.ts.typesense-search.svc.cluster.local:8107:8108,
    typesense-1.ts.typesense-search.svc.cluster.local:8107:8108,
    typesense-2.ts.typesense-search.svc.cluster.local:8107:8108
kind: ConfigMap
metadata:
  name: nodeslist
  namespace: typesense-search
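
For reference, each entry in the nodes value follows the host:peering_port:api_port layout. The sketch below parses such a string and computes the Raft majority, which is 2 for three nodes, so a single lost node should still leave a quorum. The function names are hypothetical, and the parser strips the newlines that the YAML block scalar introduces; whether the server tolerates those newlines is not established in this thread.

```python
def parse_nodes(nodes_text):
    """Split a nodes string into (host, peering_port, api_port) tuples.

    Whitespace and newlines around entries are stripped here for
    convenience; this says nothing about how Typesense parses the file.
    """
    nodes = []
    for entry in nodes_text.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # rsplit keeps any colons-free hostname intact and takes the
        # last two fields as the ports.
        host, peering_port, api_port = entry.rsplit(":", 2)
        nodes.append((host, int(peering_port), int(api_port)))
    return nodes


def quorum_size(node_count):
    """Smallest number of nodes that forms a strict majority."""
    return node_count // 2 + 1


def survives_loss(node_count, lost):
    """True if the remaining nodes still form a majority."""
    return node_count - lost >= quorum_size(node_count)
```

With the three entries from the ConfigMap above, parse_nodes yields three tuples and survives_loss(3, 1) is true, which matches the expectation that losing one pod should not take the cluster down.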