Open ggprod opened 2 years ago
In the logs for the 2 days the template ran correctly, I noticed that it executes the DetectNewPartitionsDoFn,
which runs the DetectNewPartitionsAction,
which logs a message very frequently (on average once or more every second), but then the logging suddenly stops (which perhaps suggests a thread deadlock).
I also noticed that this abrupt stop in the logging happens about 1 minute after the worker logs another error related to OpenCensus (though I am not sure whether the two are related or it is coincidental):
java.lang.NullPointerException
at io.opencensus.implcore.stats.MeasureToViewMap.record(MeasureToViewMap.java:153)
at io.opencensus.implcore.stats.StatsManager$StatsEvent.process(StatsManager.java:101)
at io.opencensus.impl.internal.DisruptorEventQueue$DisruptorEventHandler.onEvent(DisruptorEventQueue.java:229)
at io.opencensus.impl.internal.DisruptorEventQueue$DisruptorEventHandler.onEvent(DisruptorEventQueue.java:222)
at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:168)
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125)
at java.lang.Thread.run(Thread.java:750)
I believe I set the priority incorrectly, though. This could be a P2 or P3, since the template does run correctly for some time and can be restarted when this problem is detected (with a change stream start time in the past, so no change data is lost).
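A rough sketch of the detect-and-restart workaround described above, in plain Java. Everything here is an assumption for illustration: the class and method names, the 60-second silence threshold, the 5-minute safety margin, and the timestamps are hypothetical and are not part of Beam or the template. It only shows how one might decide that the DetectNewPartitions logging has gone quiet and pick a start time in the past for the restarted job.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Beam or the Dataflow template.
public class ChangeStreamStallCheck {

    // True when the last DetectNewPartitions log line is older than maxSilence.
    static boolean isStalled(Instant lastLogSeen, Instant now, Duration maxSilence) {
        return Duration.between(lastLogSeen, now).compareTo(maxSilence) > 0;
    }

    // Start time for the restarted job: the last healthy log time minus a
    // safety margin, so the restart begins reading in the past.
    static Instant restartStartTime(Instant lastLogSeen, Duration safetyMargin) {
        return lastLogSeen.minus(safetyMargin);
    }

    public static void main(String[] args) {
        // Hypothetical timestamps; in practice these would come from the worker logs.
        Instant lastLogSeen = Instant.parse("2022-08-29T10:00:00Z");
        Instant now = Instant.parse("2022-08-29T10:02:00Z");

        if (isStalled(lastLogSeen, now, Duration.ofSeconds(60))) {
            Instant startAt = restartStartTime(lastLogSeen, Duration.ofMinutes(5));
            System.out.println("Stalled; restart with change stream start time " + startAt);
        }
    }
}
```

Starting before the last healthy log line means the restarted job may read some change records a second time, so the downstream sink would need to tolerate duplicates.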
Because this works for several days and then fails, it will be hard to investigate. Can you raise a GCP support ticket with this information?
@johnjcasey yes, I agree.. sure, will do
Changing the issue priority now that this is a GCP support ticket. @ggprod, please update this issue with any resolution based on that ticket.
@ggprod any root cause for this? I am also seeing the same issue.
@anip-patel-exa I had logged a support issue (https://issuetracker.google.com/issues/244327728) but forgot to follow up, and it was closed. You could add your occurrence there, and perhaps it would get reopened.
What happened?
I am using a new Dataflow template that reads Spanner change streams via SpannerIO.readChangeStream(). The streaming template worked correctly for 2 days, but then it stopped forwarding change records and started continuously throwing errors like below
Issue Priority
Priority: 1
Issue Component
Component: io-java-gcp