housepower / clickhouse_sinker

Easily load data from kafka to ClickHouse
https://housepower.github.io/clickhouse_sinker
Apache License 2.0
515 stars 118 forks source link

Processing one partition stopped #134

Closed josepowera closed 2 years ago

josepowera commented 2 years ago

We saw that one partition stopped processing... Here are logs, with no explanation / panic /... in logs..

After restart we got both partitions processed again...

Version: master - version v2.1.0-2


    {"level":"info","ts":"2021-11-10T19:53:06.803Z","msg":"topic ssprtbwinsraw partition 1 became idle","task":"csv_ssp_ssprtbwins_raw"}
{"level":"info","ts":"2021-11-10T19:53:07.608Z","msg":"topic ssprtbwinsraw partition 1 quit idle","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:08.101Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815593591,1815609344), messages 15753, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:08.101Z","msg":"got a message(topic ssprtbwinsraw, partition 0, offset 1815626359) while the ring is full, wake-up","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:09.311Z","msg":"got a message(topic ssprtbwinsraw, partition 1, offset 1815592960) while the ring is full, waiting...","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:11.202Z","msg":"got a message(topic ssprtbwinsraw, partition 0, offset 1815642112) while the ring is full, waiting...","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:11.302Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815609344,1815625728), messages 16384, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:13.301Z","msg":"got a message(topic ssprtbwinsraw, partition 0, offset 1815642112) while the ring is full, wake-up","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:17.000Z","msg":"got a message(topic ssprtbwinsraw, partition 0, offset 1815658496) while the ring is full, waiting...","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:18.000Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815625728,1815630921), messages 5193, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:18.100Z","msg":"got a message(topic ssprtbwinsraw, partition 0, offset 1815658496) while the ring is full, wake-up","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:19.404Z","msg":"got a message(topic ssprtbwinsraw, partition 0, offset 1815663689) while the ring is full, waiting...","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:20.100Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815630921,1815642112), messages 11191, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:21.096Z","msg":"got a message(topic ssprtbwinsraw, partition 0, offset 1815663689) while the ring is full, wake-up","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:23.208Z","msg":"got a message(topic ssprtbwinsraw, partition 0, offset 1815674880) while the ring is full, waiting...","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:23.703Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815642112,1815658496), messages 16384, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:27.720Z","msg":"got a message(topic ssprtbwinsraw, partition 0, offset 1815674880) while the ring is full, wake-up","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:27.721Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815658496,1815658985), messages 489, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:32.901Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815658985,1815659443), messages 458, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:39.005Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815659443,1815674880), messages 15437, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:41.901Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815674880,1815676472), messages 1592, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T19:53:48.901Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1815676472,1815691264), messages 14792, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
............
............
{"level":"debug","ts":"2021-11-10T20:05:19.901Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817067520,1817068727), messages 1207, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:05:25.101Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817068727,1817083289), messages 14562, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:05:27.841Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817083289,1817083904), messages 615, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:05:32.501Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817083904,1817100288), messages 16384, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:05:37.673Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817100288,1817100917), messages 629, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:05:42.301Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817100917,1817116672), messages 15755, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:05:47.494Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817116672,1817118569), messages 1897, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:05:50.901Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817118569,1817133056), messages 14487, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:05:56.439Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817133056,1817133525), messages 469, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}

then we restarted program to see if a problem reappeared... but everything was normal again.

{"level":"info","ts":"2021-11-10T20:07:19.561Z","msg":"version v2.1.0-2-gd4578fa-dirty, commit d4578fa37ccd635cff7706ea018d90c414cb4f29, date 2021-11-10T19:31:30+00:00, builtBy root@80f8c4cdfedf"}
{"level":"info","ts":"2021-11-10T20:07:19.562Z","msg":"clickhouse_sinker initialization"}
{"level":"info","ts":"2021-11-10T20:07:19.563Z","msg":"Run http server at http://10.42.4.242:2306/"}
{"level":"info","ts":"2021-11-10T20:07:19.563Z","msg":"get config from local file /etc/clickhouse_sinker/tasks/ssp_ssprtbwins_raw.json"}
{"level":"info","ts":"2021-11-10T20:07:19.563Z","msg":"clickhouse_sinker initialization completed"}
{"level":"info","ts":"2021-11-10T20:07:19.564Z","msg":"going to apply the first config","config":{"Kafka":{"Brokers":"kafkaoss-kafka-bootstrap.kafka.svc.cluster.local:9092","Version":"2.2.1","Security":null,"TLS":{"Enable":false,"CaCertFiles":"","ClientCertFile":"","ClientKeyFile":"","TrustStoreLocation":"","TrustStorePassword":"","KeystoreLocation":"","KeystorePassword":"","EndpIdentAlgo":""},"Sasl":{"Enable":false,"Mechanism":"","Username":"","Password":"","GSSAPI":{"AuthType":0,"KeyTabPath":"","KerberosConfigPath":"","ServiceName":"","Username":"","Password":"","Realm":"","DisablePAFXFAST":false}}},"Clickhouse":{"Cluster":"","DB":"test5ssp","Hosts":[["clickhouse-pv-simple-19000.test-clickhouse-operator.svc.cluster.local"]],"Port":19000,"Username":"","Password":"","DsnParams":"","Secure":false,"InsecureSkipVerify":false,"RetryTimes":0,"MaxOpenConns":1},"Task":null,"Tasks":[{"Name":"csv_ssp_ssprtbwins_raw","KafkaClient":"sarama","Topic":"ssprtbwinsraw","ConsumerGroup":"chrawctb6","Earliest":true,"Parser":"csv","CsvFormat":["tenantid","dspid","dspseatid","adprocesstype","adzoneid","campaignid","creativeid","geocountry3","dealid","currency","requestid","eventtime","auctionid","impressionid","bidid","mediacost","bidprice","quantity","publisherid","siteid","crtype","mediatype"],"Delimiter":",","TableName":"ssprtbwins_raw_raw","AutoSchema":true,"ExcludeColumns":["eventDate","eventTimeStamp"],"dims":null,"DynamicSchema":{"Enable":false,"MaxDims":0,"WhiteList":"","BlackList":""},"PrometheusSchema":false,"flushInterval":5,"bufferSize":16384,"timeZone":"Local","timeUnit":1}],"Assignment":{"Version":0,"UpdatedAt":0,"UpdatedBy":"","Map":null},"LogLevel":"debug"}}
{"level":"info","ts":"2021-11-10T20:07:19.585Z","msg":"sql.Open and sqlDB.Ping succeeded","dbVer":1,"dsn":"tcp://10.43.168.18:19000?database=test5ssp&username=&password=&block_size=2097152"}
{"level":"info","ts":"2021-11-10T20:07:19.585Z","msg":"initialized parsing pool","maxWorkers":3,"queueSize":65536}
{"level":"info","ts":"2021-11-10T20:07:19.585Z","msg":"initialized writing pool","maxWorkers":1,"queueSize":3}
{"level":"info","ts":"2021-11-10T20:07:19.585Z","msg":"task initializing","task":"csv_ssp_ssprtbwins_raw"}
{"level":"info","ts":"2021-11-10T20:07:19.668Z","msg":"Prepare sql=> INSERT INTO test5ssp.ssprtbwins_raw_raw (`tenantid`,`dspid`,`dspseatid`,`adprocesstype`,`adzoneid`,`campaignid`,`creativeid`,`geocountry3`,`dealid`,`currency`,`requestid`,`eventtime`,`auctionid`,`impressionid`,`bidid`,`mediacost`,`bidprice`,`quantity`,`publisherid`,`siteid`,`crtype`,`mediatype`) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)","task":"csv_ssp_ssprtbwins_raw"}
{"level":"info","ts":"2021-11-10T20:07:19.679Z","msg":"applied the first config"}
{"level":"info","ts":"2021-11-10T20:07:23.168Z","msg":"topic ssprtbwinsraw partition 0 quit idle","task":"csv_ssp_ssprtbwins_raw"}
{"level":"info","ts":"2021-11-10T20:07:23.559Z","msg":"topic ssprtbwinsraw partition 1 quit idle","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:07:27.753Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817232569,1817244461), messages 11892, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:07:27.853Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 1, offset [1815560192,1815568967), messages 8775, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:07:29.652Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 1, offset [1815568967,1815576576), messages 7609, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:07:30.651Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 0, offset [1817244461,1817247744), messages 3283, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:07:31.655Z","msg":"got a message(topic ssprtbwinsraw, partition 1, offset 1815609344) while the ring is full, waiting...","task":"csv_ssp_ssprtbwins_raw"}
{"level":"debug","ts":"2021-11-10T20:07:33.654Z","msg":"going to flush a batch for topic ssprtbwinsraw patittion 1, offset [1815576576,1815588704), messages 12128, parse errors: 0","task":"csv_ssp_ssprtbwins_raw"}
yuzhichang commented 2 years ago

What's the Kafka broker version? It could be the compatibility issue between sarama and Kafka. You can enable sarama log at kafka_sarama.go:111. You can also check clickhouse_sinker' goroutine stack from built-in http server. The listen port is determined at startup. {"level":"info","ts":"2021-11-10T20:07:19.563Z","msg":"Run http server at http://10.42.4.242:2306/"}

josepowera commented 2 years ago

Now we moved to kafka 2.8 (to stay on sarama compatibility list) and updated to correct version in config-task files. Will wait if it reappears.