banzaicloud / spark-metrics

Spark metrics related custom classes and sinks (e.g. Prometheus)
Apache License 2.0

unable to send metrics to Prometheus Push Gateway #40

Closed vijayrajah closed 5 years ago

vijayrajah commented 5 years ago

Describe the bug

Unable to push data to the Prometheus Push Gateway.

Here is the error stack trace:

19/09/19 08:12:17 ERROR DatabricksMain$DBUncaughtExceptionHandler: Uncaught exception in thread PrometheusMetricsExporter!
java.lang.NoSuchFieldError: timestampMs
    at com.banzaicloud.spark.metrics.DropwizardExports$$anonfun$collect$1$$anonfun$apply$1.apply(DropwizardExports.scala:39)
    at com.banzaicloud.spark.metrics.DropwizardExports$$anonfun$collect$1$$anonfun$apply$1.apply(DropwizardExports.scala:34)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at com.banzaicloud.spark.metrics.DropwizardExports$$anonfun$collect$1.apply(DropwizardExports.scala:33)
    at com.banzaicloud.spark.metrics.DropwizardExports$$anonfun$collect$1.apply(DropwizardExports.scala:29)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at com.banzaicloud.spark.metrics.DropwizardExports.collect(DropwizardExports.scala:28)
    at com.banzaicloud.spark.metrics.DropwizardExportsWithMetricNameCaptureAndReplace.collect(DropwizardExportsWithMetricNameCaptureAndReplace.scala:48)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:72)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:87)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:57)
    at java.util.Collections.list(Collections.java:5240)
    at io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:17)
    at com.databricks.DatabricksMain.com$databricks$DatabricksMain$$prometheusMetricsString(DatabricksMain.scala:315)
    at com.databricks.DatabricksMain$$anonfun$startPrometheusMetricsExport$1.apply$mcV$sp(DatabricksMain.scala:298)
    at com.databricks.DatabricksMain$$anonfun$startPrometheusMetricsExport$1.apply(DatabricksMain.scala:297)
    at com.databricks.DatabricksMain$$anonfun$startPrometheusMetricsExport$1.apply(DatabricksMain.scala:297)
    at com.databricks.util.UntrustedUtils$.tryLog(UntrustedUtils.scala:98)
    at com.databricks.threading.NamedTimer$$anon$1$$anonfun$run$1.apply(NamedTimer.scala:53)
    at com.databricks.threading.NamedTimer$$anon$1$$anonfun$run$1.apply(NamedTimer.scala:53)
    at com.databricks.logging.UsageLogging$$anonfun$recordOperation$1.apply(UsageLogging.scala:359)
    at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:235)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:230)
    at com.databricks.threading.NamedTimer$$anon$1.withAttributionContext(NamedTimer.scala:50)
    at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:268)
    at com.databricks.threading.NamedTimer$$anon$1.withAttributionTags(NamedTimer.scala:50)
    at com.databricks.logging.UsageLogging$class.recordOperation(UsageLogging.scala:345)
    at com.databricks.threading.NamedTimer$$anon$1.recordOperation(NamedTimer.scala:50)
    at com.databricks.threading.NamedTimer$$anon$1.run(NamedTimer.scala:52)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)

My metrics.properties file:

# Enable Prometheus for all instances by class name
*.sink.prometheus.class=com.banzaicloud.spark.metrics.sink.PrometheusSink
# Prometheus pushgateway address
*.sink.prometheus.pushgateway-address-protocol=http
*.sink.prometheus.pushgateway-address=<IP>:9091
*.sink.prometheus.period=10
*.sink.prometheus.unit=seconds

# Metrics name processing (version 2.3-1.1.0 +)
#*.sink.prometheus.metrics-name-capture-regex=(.+driver_)(.+)
#*.sink.prometheus.metrics-name-replacement=$2
master.sink.prometheus.metrics-name-capture-regex=(.*)
master.sink.prometheus.metrics-name-replacement=master_$1
worker.sink.prometheus.metrics-name-capture-regex=(.*)
worker.sink.prometheus.metrics-name-replacement=worker_$1
executor.sink.prometheus.metrics-name-capture-regex=(.*)
executor.sink.prometheus.metrics-name-replacement=executor_$1
driver.sink.prometheus.metrics-name-capture-regex=(.*)
driver.sink.prometheus.metrics-name-replacement=driver_$1
applications.sink.prometheus.metrics-name-capture-regex=(.*)
applications.sink.prometheus.metrics-name-replacement=app_$1
*.sink.prometheus.labels=name=vijaytest1,name2=test123
# Support for JMX Collector (version 2.3-2.0.0 +)
#*.sink.prometheus.enable-dropwizard-collector=false
#*.sink.prometheus.enable-jmx-collector=true
#*.sink.prometheus.jmx-collector-config=/some/path/jvm_exporter.yml

# Enable HostName in Instance instead of Appid (Default value is false i.e. instance=${appid})
*.sink.prometheus.enable-hostname-in-instance=true

# Enable JVM metrics source for all instances by class name
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
*.source.jvm.class=org.apache.spark.metrics.source.JvmSource
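
The capture/replace pairs above can be sketched as a full-match rename. The sketch below is an approximation only: the sink itself is JVM code using Java regex (where the replacement syntax is $1/$2 rather than \1/\2), and the metric names are made up for illustration.

```python
import re

# Sketch of metrics-name-capture-regex / metrics-name-replacement, assuming
# the sink does a full-match capture-and-replace on each metric name.
# Java's $1/$2 in the config corresponds to \1/\2 here; names are made up.
def rename(metric, capture, replacement):
    m = re.fullmatch(capture, metric)
    return m.expand(replacement) if m else metric

# The driver.sink.prometheus.* settings above prefix every name:
print(rename("jvm_heap_used", r"(.*)", r"driver_\1"))
# The commented-out example instead strips everything up to "driver_":
print(rename("app_123_driver_jvm_heap_used", r"(.+driver_)(.+)", r"\2"))
```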

Steps to reproduce the issue: I copied the jar from Maven: https://mvnrepository.com/artifact/com.banzaicloud/spark-metrics_2.11/2.3-2.1.0

Additional context

Pushgateway version: 0.9.1
Prometheus version: 2.12.0
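
Independently of the Spark sink, the Pushgateway itself can be smoke-tested with a plain HTTP PUT of one metric in the Prometheus text exposition format. This is a stdlib-only sketch; the localhost address is a placeholder standing in for the <IP>:9091 value from metrics.properties, and the metric/job names are made up.

```python
import urllib.request

# Smoke test: push one metric straight to the Pushgateway, bypassing Spark.
# The address is a placeholder for the <IP>:9091 value from metrics.properties.
payload = b"spark_sink_smoke_test 1\n"  # Prometheus text exposition format
req = urllib.request.Request(
    "http://localhost:9091/metrics/job/spark_sink_smoke_test",
    data=payload,
    method="PUT",
    headers={"Content-Type": "text/plain; version=0.0.4"},
)
# urllib.request.urlopen(req)  # uncomment once the real address is in place
print(req.get_method(), req.full_url)
```

If this PUT succeeds but the Spark sink still fails, the problem is on the sink/classpath side rather than the gateway.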

Similar issues have been reported: https://github.com/census-instrumentation/opencensus-java/issues/1215

stoader commented 5 years ago

@vijayrajah can you check that the appropriate versions of the dependent jars are copied to the host as well? You can find the list of jars here: https://github.com/banzaicloud/spark-metrics/blob/master/PrometheusSink.md#how-to-enable-prometheussink-in-spark

io.prometheus:simpleclient_dropwizard:0.3.0
io.prometheus:simpleclient_pushgateway:0.3.0
io.dropwizard.metrics:metrics-core:3.1.2

You can download the dependencies to a temp directory using the following steps:

  1. mvn dependency:get -DgroupId=com.banzaicloud -DartifactId=spark-metrics_2.11 -Dversion=2.3-2.1.0
  2. mkdir temp
  3. mvn dependency:copy-dependencies -f ~/.m2/repository/com/banzaicloud/spark-metrics_2.11/2.3-2.1.0/spark-metrics_2.11-2.3-2.1.0.pom -DoutputDirectory=$(pwd)/temp
vijayrajah commented 5 years ago

@stoader Yes, that fixed it. Thanks... Sorry, I could not respond earlier...