wepay / kafka-connect-bigquery

DEPRECATED. PLEASE USE https://github.com/confluentinc/kafka-connect-bigquery. A Kafka Connect BigQuery sink connector
Apache License 2.0
155 stars 192 forks source link

Offset commit failed on confluentinc/cp-kafka:4.1.0 #158

Open geopamplona opened 5 years ago

geopamplona commented 5 years ago

I have a connector configured against bigquery ( i am using kafka confluentinc/cp-kafka:4.1.0 in distributed mode with 3 brokers), it send data from a topic that does not have a data flow, but rather 100 static records.

As a result, the connector create the table and fill it in bigquery, however, it continues to send records. In bigquery I can see how it fills the streaming buffer, and generate CPU usage. It's like connector can not confirm the sending of messages in kafka to bigquery or similar.

17/5/2019 8:49:29[2019-05-17 06:49:29,009] ERROR WorkerSinkTask{id=......-v5-0} Offset commit failed, rewinding to last committed offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
17/5/2019 8:49:29java.lang.ClassCastException: org.apache.kafka.clients.consumer.OffsetAndMetadata cannot be cast to org.apache.kafka.clients.consumer.OffsetAndMetadata
17/5/2019 8:49:29   at com.wepay.kafka.connect.bigquery.BigQuerySinkTask.updateOffsets(BigQuerySinkTask.java:137)
17/5/2019 8:49:29   at com.wepay.kafka.connect.bigquery.BigQuerySinkTask.flush(BigQuerySinkTask.java:126)
17/5/2019 8:49:29   at org.apache.kafka.connect.sink.SinkTask.preCommit(SinkTask.java:117)
17/5/2019 8:49:29   at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:345)
17/5/2019 8:49:29   at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:182)
17/5/2019 8:49:29   at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)
17/5/2019 8:49:29   at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
17/5/2019 8:49:29   at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
17/5/2019 8:49:29   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
17/5/2019 8:49:29   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
17/5/2019 8:49:29   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
17/5/2019 8:49:29   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
17/5/2019 8:49:29   at java.lang.Thread.run(Thread.java:745)

my config connector

connector.class=com.wepay.kafka.connect.bigquery.BigQuerySinkConnector
autoUpdateSchemas=true
sanitizeTopics=true
autoCreateTables=true
tasks.max=1
topics=TOPIC
schemaRegistryLocation=http://schema-registry:28081
topicsToTables=TOPIC=TABLE
project=big-query-test
datasets=.*=DATASET
keyfile=/etc/kafka-connect/certs/big-query.json
schemaRetriever=com.wepay.kafka.connect.bigquery.schemaregistry.schemaretriever.SchemaRegistrySchemaRetriever

edit: the connector does not appear as a consumer of the topic either

edit2: it seems that the connector can not confirm the offset it has consumed .. what version of kafka connect is the connector made of?

geopamplona commented 5 years ago

any opinion? or idea?