The job usually works well, but occasionally the stream stops abruptly and the lines below repeat in a loop in the log4j output. Restarting the job processes all the data in the backlog. What could be causing this?
22/02/27 00:57:58 INFO HiveMetaStore: 1: get_database: default
22/02/27 00:57:58 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_database: default
22/02/27 00:57:58 INFO DriverCorral: Metastore health check ok
22/02/27 00:58:07 INFO HikariDataSource: metastore-monitor - Starting...
22/02/27 00:58:07 INFO HikariDataSource: metastore-monitor - Start completed.
22/02/27 00:58:07 INFO HikariDataSource: metastore-monitor - Shutdown initiated...
22/02/27 00:58:07 INFO HikariDataSource: metastore-monitor - Shutdown completed.
22/02/27 00:58:07 INFO MetastoreMonitor: Metastore healthcheck successful (connection duration = 88 milliseconds)
22/02/27 00:58:50 INFO RxDocumentClientImpl: Getting database account endpoint from https://<cosmosdb_endpoint>.documents.azure.com:443
I have a Spark streaming job that reads Azure Cosmos DB change feed data, running on a Databricks cluster with DBR 8.2.