A while back, the Wikimedia streams (recentchange at least) added "canary" records to help with heartbeat/keepalive. Helper libraries were updated to filter these out automatically. They can be identified by meta.domain == 'canary'. See https://phabricator.wikimedia.org/T266798
These canary records are thin: specifically, they don't include the bot boolean field, which is present on all other records. This results in a NullPointerException when attempting to filter on the bot field as a boolean:
[wikipedia-activity-monitor-StreamThread-1] ERROR org.apache.kafka.streams.processor.internals.TaskExecutor - stream-thread [wikipedia-activity-monitor-StreamThread-1] Failed to process stream task 0_1 due to the following error:
org.apache.kafka.streams.errors.StreamsException: Exception caught in process. taskId=0_1, processor=KSTREAM-SOURCE-0000000000, topic=wikipedia.parsed, partition=1, offset=1139, stacktrace=java.lang.NullPointerException
at io.confluent.demos.common.wiki.WikipediaActivityMonitor.lambda$createMonitorStream$1(WikipediaActivityMonitor.java:119)
at org.apache.kafka.streams.kstream.internals.KStreamFilter$KStreamFilterProcessor.process(KStreamFilter.java:43)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:159)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:290)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:269)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:228)
at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:48)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:157)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:290)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:269)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:228)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:84)
at org.apache.kafka.streams.processor.internals.StreamTask.lambda$doProcess$1(StreamTask.java:793)
at org.apache.kafka.streams.processor.internals.metrics.StreamsMetricsImpl.maybeMeasureLatency(StreamsMetricsImpl.java:872)
at org.apache.kafka.streams.processor.internals.StreamTask.doProcess(StreamTask.java:793)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:724)
at org.apache.kafka.streams.processor.internals.TaskExecutor.processTask(TaskExecutor.java:100)
at org.apache.kafka.streams.processor.internals.TaskExecutor.process(TaskExecutor.java:81)
at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:1182)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:768)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:588)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:550)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:750)
at org.apache.kafka.streams.processor.internals.TaskExecutor.processTask(TaskExecutor.java:100)
at org.apache.kafka.streams.processor.internals.TaskExecutor.process(TaskExecutor.java:81)
at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:1182)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:768)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:588)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:550)
Caused by: java.lang.NullPointerException
at io.confluent.demos.common.wiki.WikipediaActivityMonitor.lambda$createMonitorStream$1(WikipediaActivityMonitor.java:119)
at org.apache.kafka.streams.kstream.internals.KStreamFilter$KStreamFilterProcessor.process(KStreamFilter.java:43)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:159)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:290)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:269)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:228)
at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:48)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:157)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:290)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:269)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:228)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:84)
at org.apache.kafka.streams.processor.internals.StreamTask.lambda$doProcess$1(StreamTask.java:793)
at org.apache.kafka.streams.processor.internals.metrics.StreamsMetricsImpl.maybeMeasureLatency(StreamsMetricsImpl.java:872)
at org.apache.kafka.streams.processor.internals.StreamTask.doProcess(StreamTask.java:793)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:724)
... 6 more
[wikipedia-activity-monitor-StreamThread-1] ERROR org.apache.kafka.streams.KafkaStreams - stream-client [wikipedia-activity-monitor] Encountered the following exception during processing and the registered exception handler opted to SHUTDOWN_CLIENT. The streams client is going to shut down now.
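The trace above is consistent with a filter predicate that unboxes a null bot value. As a rough sketch of one possible guard (not the actual code at WikipediaActivityMonitor.java:119, which is not shown here), the example below models a parsed record as a plain Map and assumes the filter is meant to keep non-bot edits. The class name CanaryFilterSketch, the helper record(...), the Map layout, and the placement of bot at the top level of the record are all assumptions made for illustration; only meta.domain == 'canary' and the missing bot field come from the report.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the monitor's filter logic; records are modelled as Maps.
final class CanaryFilterSketch {

    // A record is treated as a canary when meta.domain equals "canary".
    static boolean isCanary(Map<String, Object> record) {
        Object meta = record.get("meta");
        if (meta instanceof Map) {
            return "canary".equals(((Map<?, ?>) meta).get("domain"));
        }
        return false;
    }

    // Keeps only non-canary records whose bot field is present and false.
    // Boolean.FALSE.equals(null) is false, so a missing bot field drops the
    // record instead of throwing a NullPointerException on unboxing.
    static boolean isHumanEdit(Map<String, Object> record) {
        if (isCanary(record)) {
            return false;
        }
        return Boolean.FALSE.equals(record.get("bot"));
    }

    // Builds a sample record; a null bot argument leaves the field out entirely,
    // which is how the thin canary records are described in this report.
    static Map<String, Object> record(String domain, Boolean bot) {
        Map<String, Object> r = new HashMap<>();
        r.put("meta", Map.of("domain", domain));
        if (bot != null) {
            r.put("bot", bot);
        }
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = List.of(
                record("en.wikipedia.org", false),
                record("canary", null),
                record("commons.wikimedia.org", true));
        long kept = records.stream().filter(CanaryFilterSketch::isHumanEdit).count();
        System.out.println("kept " + kept + " of " + records.size());
    }
}
```

In a Kafka Streams topology the same two checks could be expressed inside the predicate passed to KStream#filter, so canary records are dropped before the bot field is ever read.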
Here's a sample canary record as parsed-in by our connector:
You can also observe these canary records with: