Azure-Samples / azure-sql-db-change-stream-debezium

SQL Server Change Stream sample using Debezium
MIT License

Debezium connector failing randomly with error "history topic or its content is fully or partially missing" #5

Open Nicodox opened 2 years ago

Nicodox commented 2 years ago

Hi all,

we've deployed Debezium on an Azure AKS cluster and connected it to Azure Event Hub to capture data changes on Azure SQL Server PaaS. Everything works fine, but we randomly get errors and then need to create a new connector. More details follow:

we've set up a Debezium CDC deployment on the AKS cluster following this guide.

The debezium deployment is capturing CDC events on SQLServer PaaS on Azure and transferring them as events in the event hub.

Debezium connector is working as expected, publishing CDC events to the event hub, which are consumed by other tools; however, after some time Debezium returns the following error:

ERROR || WorkerSourceTask{id=wwi4-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted [org.apache.kafka.connect.runtime.WorkerTask] io.debezium.DebeziumException: The db history topic or its content is fully or partially missing. Please check database history topic configuration and re-execute the snapshot.

Once this error occurs, CDC stops working. The only workaround we have found is to create a new Debezium connector, but the new connector ends up with the same error after some time. It can be a few hours or a few days between the creation of the connector and the error.

This is the Connector:

{
  "snapshot.mode": "schema_only",
  "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
  "database.hostname": "XXXXXXXXX",
  "database.port": "1433",
  "database.user": "XXXXX",
  "database.password": "XXXXXX",
  "database.dbname": "XXXXX",
  "database.server.name": "SQLAzure",
  "tasks.max": "1",
  "decimal.handling.mode": "string",
  "table.include.list": "XXXXXX,YYYYY,ZZZZZ",
  "transforms": "Reroute",
  "transforms.Reroute.type": "io.debezium.transforms.ByLogicalTableRouter",
  "transforms.Reroute.topic.regex": "(.*)",
  "transforms.Reroute.topic.replacement": "wwi",
  "tombstones.on.delete": false,
  "database.history": "io.debezium.relational.history.MemoryDatabaseHistory"
}

I think the problem is related to the parameter and value: "database.history": "io.debezium.relational.history.MemoryDatabaseHistory"

From the Debezium documentation (https://debezium.io/documentation/reference/stable/operations/debezium-server.html#debezium-source-database-history-class) we read that: io.debezium.relational.history.MemoryDatabaseHistory is a "volatile store for test environments".

So my questions are:

1) Can we avoid the "history topic fully or partially missing" error while keeping the value io.debezium.relational.history.MemoryDatabaseHistory, and if so, how?

2) Can we go to production while keeping the value io.debezium.relational.history.MemoryDatabaseHistory, even though the Debezium documentation describes it as a "volatile store for test environments"?

Can you please help us solve this issue?

Thank you,

kind regards from Italy,

Nicolò

thepaulmacca commented 2 years ago

@Nicodox I would love to see how you've configured this on AKS. Do you have a blog post or anything that you could share please?

yorek commented 2 years ago

Hi @Nicodox, and sorry for not answering sooner; the issue completely went under the radar. Thanks @thepaulmacca for bringing this up again :) The suggestion would be to use FileDatabaseHistory or RedisDatabaseHistory, as mentioned in the documentation you linked. That applies if EventHub still doesn't support automatic topic creation for the database history (it's been a while since I tested it; maybe you can check whether the issue has been resolved by now: https://github.com/Azure/azure-event-hubs-for-kafka/issues/61). Otherwise the default KafkaDatabaseHistory should be the preferred option, AFAIK.
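
As a rough illustration of the FileDatabaseHistory suggestion, the history-related part of the connector config might look like the fragment below for Debezium 1.x. This is only a sketch: the file path is a placeholder, not something from this thread, and on AKS it would presumably need to sit on a persistent volume so that the history survives pod restarts. All other connector properties stay as they are.

```json
{
  "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
  "database.history": "io.debezium.relational.history.FileDatabaseHistory",
  "database.history.file.filename": "<path-on-persistent-volume>/history.dat"
}
```

With MemoryDatabaseHistory the recorded schema history lives only in the connector process, so it is lost on any restart, whereas a file-backed history can be read back as long as the file itself is preserved.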

thepaulmacca commented 2 years ago

@yorek do you know of an example on using this with AKS or anything?

Between the debezium docs and this repo, I'm struggling to get my head around an end-to-end configuration (I am quite new to debezium though!)

If you know of any useful blog posts or anything, that would be helpful

Thanks

yorek commented 2 years ago

@thepaulmacca unfortunately not. I'm not an AKS expert and I've only used Debezium with "plain" containers (Docker and Azure Container Instances)... ping me on Twitter (https://twitter.com/mauridb) so that I can bring Gunnar - the dev lead for Debezium - into the discussion. I'm sure he can help more than I can :)

thepaulmacca commented 2 years ago

That would be great, thanks!

poikjo commented 1 week ago

Anything new on this topic?

We had Debezium 1.8 running for almost a year and a half without issues, using history settings like: "database.history": "io.debezium.relational.history.FileDatabaseHistory", "database.history.file.filename": "history.dat". Now, after updating to 2.7 with Event hub schema history, we have started seeing the "The db history topic is missing" exception, and apparently the only solution is to recreate the connection.

Should schema history on Event hub work fine? What about the cleanup policy? Is there a mismatch in the documentation, given that the SQL Server connector configuration still mentions MemoryDatabaseHistory?
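
For comparison with the 1.8 settings quoted above: Debezium 2.x renamed the `database.history*` properties to `schema.history.internal*`, so a file-based equivalent on 2.7 would look roughly like the fragment below. This is a sketch rather than a verified fix for the exception; the path is a placeholder and would need to point at storage that outlives the container.

```json
{
  "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
  "schema.history.internal": "io.debezium.storage.file.history.FileSchemaHistory",
  "schema.history.internal.file.filename": "<path-on-persistent-volume>/history.dat"
}
```

On the cleanup-policy question, the Debezium documentation asks for the schema history topic to be retained indefinitely and not compacted, so the retention settings of the Event Hub backing that topic are worth checking when records appear to go missing.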