Closed estebangarcia closed 7 years ago
It looks like there are failures related to the DataSketches stats library:
java.lang.ArrayIndexOutOfBoundsException: 256
at com.yahoo.sketches.quantiles.DoublesUpdateImpl.zipSize2KBuffer(DoublesUpdateImpl.java:127)
at com.yahoo.sketches.quantiles.DoublesUpdateImpl.inPlacePropagateCarry(DoublesUpdateImpl.java:92)
at com.yahoo.sketches.quantiles.DoublesUpdateImpl.processFullBaseBuffer(DoublesUpdateImpl.java:46)
at com.yahoo.sketches.quantiles.HeapDoublesSketch.update(HeapDoublesSketch.java:176)
at org.apache.bokkeeper.stats.datasketches.DataSketchesOpStatsLogger.registerSuccessfulEvent(DataSketchesOpStatsLogger.java:59)
at org.apache.bookkeeper.bookie.Journal.run(Journal.java:895)
This exception is happening in the Journal thread and causes the bookie process to restart.
Another exception occurs during stats collection:
2017-02-26 17:59:56,096 - WARN - [metrics-1-1:DataSketchesMetricsProvider@76] - Failed to report stats: 128
java.lang.ArrayIndexOutOfBoundsException: 128
at com.yahoo.sketches.quantiles.DoublesAuxiliary.populateFromQuantilesSketch(DoublesAuxiliary.java:99)
at com.yahoo.sketches.quantiles.DoublesAuxiliary.<init>(DoublesAuxiliary.java:38)
at com.yahoo.sketches.quantiles.DoublesSketch.constructAuxiliary(DoublesSketch.java:607)
at com.yahoo.sketches.quantiles.DoublesSketch.getQuantile(DoublesSketch.java:195)
at org.apache.bokkeeper.stats.datasketches.DataSketchesOpStatsLogger.getMedian(DataSketchesOpStatsLogger.java:121)
at org.apache.bokkeeper.stats.datasketches.JsonFileReporter.lambda$report$9(JsonFileReporter.java:67)
at java.util.concurrent.ConcurrentSkipListMap.forEach(ConcurrentSkipListMap.java:3252)
at org.apache.bokkeeper.stats.datasketches.JsonFileReporter.report(JsonFileReporter.java:61)
at org.apache.bokkeeper.stats.datasketches.DataSketchesMetricsProvider.lambda$null$5(DataSketchesMetricsProvider.java:74)
As a workaround, you can fall back to a different stats implementation for the bookies, e.g.:
statsProviderClass=org.apache.bookkeeper.stats.CodahaleMetricsProvider
codahaleStatsJmxEndpoint=metrics
and collect the stats through JMX. Alternatively, comment out statsProviderClass to disable stats entirely.
I don't understand the specific exceptions yet. I need to dig a bit into that code or ask the DataSketches people for help.
Thanks for your response. We'll disable the stats for now.
There seems to be a concurrency issue in the stats provider: two threads (the Journal thread updating the sketch and the metrics thread reading quantiles from it) can see inconsistent internal state and therefore throw exceptions. I'll send a fix in the bookkeeper repo branch.
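To illustrate the kind of race described above, here is a minimal sketch of one possible mitigation: putting every read and write of a non-thread-safe recorder behind a single lock. This is not necessarily the fix that will land in the repo. `UnsafeRecorder`, `GuardedRecorder` and `GuardedRecorderDemo` are hypothetical names made up for this example, and `UnsafeRecorder` is a plain stand-in, not the DataSketches API.

```java
import java.util.Arrays;

/**
 * Hypothetical stand-in for a non-thread-safe quantile recorder.
 * NOT the DataSketches API: it only mimics "growable buffer + counter",
 * which is enough to break when two threads touch it without a lock.
 */
final class UnsafeRecorder {
    private double[] buffer = new double[16];
    private int count = 0;

    void update(double value) {
        if (count == buffer.length) {
            // Grow the buffer; an unsynchronized reader could observe
            // a count that does not match the buffer it is holding.
            buffer = Arrays.copyOf(buffer, buffer.length * 2);
        }
        buffer[count++] = value;
    }

    double median() {
        if (count == 0) {
            return Double.NaN;
        }
        double[] sorted = Arrays.copyOf(buffer, count);
        Arrays.sort(sorted);
        return sorted[count / 2]; // upper median for even counts
    }

    int size() {
        return count;
    }
}

/** Serializes the writer thread and the reporter thread on one lock. */
final class GuardedRecorder {
    private final UnsafeRecorder delegate = new UnsafeRecorder();

    synchronized void registerEvent(double latencyMillis) {
        delegate.update(latencyMillis);
    }

    synchronized double getMedian() {
        return delegate.median();
    }

    synchronized int getCount() {
        return delegate.size();
    }
}

class GuardedRecorderDemo {
    public static void main(String[] args) throws InterruptedException {
        GuardedRecorder recorder = new GuardedRecorder();

        // Plays the role of the Journal thread recording latencies.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                recorder.registerEvent(i);
            }
        });
        // Plays the role of the metrics reporter reading quantiles.
        Thread reporter = new Thread(() -> {
            for (int i = 0; i < 100; i++) {
                recorder.getMedian();
            }
        });

        writer.start();
        reporter.start();
        writer.join();
        reporter.join();

        System.out.println(recorder.getCount()); // prints 10000
    }
}
```

The trade-off is that a single lock puts contention on the hot write path. Other designs, such as per-thread recorders that are merged at report time, avoid that at the cost of more code.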
Hi. We had a couple of bookkeepers that crashed unexpectedly at different moments. We gathered the logs from before the crash and I'm attaching them. Any help will be much appreciated.
Thanks.
Attachment: bklogs.txt