jfindley / newrelic_exporter

NewRelic exporter for prometheus
BSD 2-Clause "Simplified" License

Querying http://exporter_ip:9126/metrics does not show any new relic metrics #4

Closed mkristiansen closed 8 years ago

mkristiansen commented 8 years ago

Steps:

Download and install Go (go1.4.2.darwin-amd64-osx10.8.pkg) from https://golang.org/dl/
git clone https://github.com/jfindley/newrelic_exporter.git (note that I took the master branch to get logging)
cd newrelic_exporter
make
./newrelic_exporter -api.key=xxxxxxxxxxxxxxx -log.level=debug

Supporting data

Log output when I query http://exporter_ip:9126/metrics with curl or in a browser:

INFO[0000] Listening on :9126.                           file=newrelic_exporter.go line=490
DEBU[0008] Starting new scrape at 1453143110819899063.   file=newrelic_exporter.go line=280
DEBU[0008] Requesting application list from https://api.newrelic.com.  file=newrelic_exporter.go line=63
DEBU[0008] Making API call: https://api.newrelic.com/v2/applications.json  file=newrelic_exporter.go line=397
^C%

Browser / curl output:

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 5.4335e-05
go_gc_duration_seconds{quantile="0.25"} 6.7023e-05
go_gc_duration_seconds{quantile="0.5"} 0.00010123300000000001
go_gc_duration_seconds{quantile="0.75"} 0.000132401
go_gc_duration_seconds{quantile="1"} 0.000157467
go_gc_duration_seconds_sum 0.0005981490000000001
go_gc_duration_seconds_count 6
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 9
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 460760
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 521528
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.440192e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 833
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 71705
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 460760
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 81920
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 770048
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 931
# HELP go_memstats_heap_released_bytes_total Total number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes_total counter
go_memstats_heap_released_bytes_total 0
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 851968
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.4531431019981072e+19
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 5
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 1764
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 1200
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 6552
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 16384
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 676848
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 292639
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 196608
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 196608
# HELP go_memstats_sys_bytes Number of bytes obtained by system. Sum of all system allocations.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 2.88588e+06
# HELP http_request_duration_microseconds The HTTP request latencies in microseconds.
# TYPE http_request_duration_microseconds summary
http_request_duration_microseconds{handler="prometheus",quantile="0.5"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.9"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.99"} NaN
http_request_duration_microseconds_sum{handler="prometheus"} 0
http_request_duration_microseconds_count{handler="prometheus"} 0
# HELP http_request_size_bytes The HTTP request sizes in bytes.
# TYPE http_request_size_bytes summary
http_request_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_request_size_bytes_sum{handler="prometheus"} 0
http_request_size_bytes_count{handler="prometheus"} 0
# HELP http_response_size_bytes The HTTP response sizes in bytes.
# TYPE http_response_size_bytes summary
http_response_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_response_size_bytes_sum{handler="prometheus"} 0
http_response_size_bytes_count{handler="prometheus"} 0
# HELP newrelic_exporter_last_scrape_duration_seconds The last scrape duration.
# TYPE newrelic_exporter_last_scrape_duration_seconds gauge
newrelic_exporter_last_scrape_duration_seconds 3.564219062
# HELP newrelic_exporter_last_scrape_error The last scrape error status.
# TYPE newrelic_exporter_last_scrape_error gauge
newrelic_exporter_last_scrape_error 1
# HELP newrelic_exporter_scrapes_total Total scraped metrics
# TYPE newrelic_exporter_scrapes_total counter
newrelic_exporter_scrapes_total 1

Go version

go version go1.4.2 darwin/amd64

Can you assist?

jfindley commented 8 years ago

Hi,

Do you have a pro or higher level NewRelic subscription?

Can you include the output of:

curl -X GET 'https://api.newrelic.com/v2/applications.json' \
     -H 'X-Api-Key:xxxxxxxxxx' -i 

curl -X GET 'https://api.newrelic.com/v2/applications/{application_id}/metrics.json' \
     -H 'X-Api-Key:xxxxxxxxxx' -i 

(where application_id is one of the id fields in the first response)

Obviously it'd be useful to have as much data as possible, but you might want to filter out anything sensitive in there.
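If it helps, here's a quick way to pull out just the application IDs from the first endpoint. This is a rough standalone sketch, not part of the exporter — it assumes the usual {"applications": [...]} response shape and an API key in a NEWRELIC_API_KEY environment variable:

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// appList holds only the fields we care about from /v2/applications.json.
type appList struct {
	Applications []struct {
		ID   int    `json:"id"`
		Name string `json:"name"`
	} `json:"applications"`
}

func main() {
	req, err := http.NewRequest("GET", "https://api.newrelic.com/v2/applications.json", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("X-Api-Key", os.Getenv("NEWRELIC_API_KEY"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var apps appList
	if err := json.NewDecoder(resp.Body).Decode(&apps); err != nil {
		panic(err)
	}
	for _, a := range apps.Applications {
		fmt.Println(a.ID, a.Name)
	}
}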

Thanks!

mkristiansen commented 8 years ago

Thank you for looking into this!

I collected the following data

All 4 files are here

It is my employer's NR account. Subscriptions included: Web Enterprise Annual, Mobile Enterprise Trial, Insights Pro Annual, Browser Pro Annual, Synthetics Pro Annual.

jfindley commented 8 years ago

Hi,

I haven't yet been able to reproduce the issue. However, I've just merged a change to the API fetching code that fixes a significant memory leak and improves performance somewhat. It's possible that it was failing previously because you have considerably more applications than I'm able to test with; if so, this might fix it.

Would you mind checking if the latest version ( f2545ce ) fixes it?

mkristiansen commented 8 years ago

At a cursory look (cloned the repo afresh, ran make, started the exporter, used a browser to access /metrics) I get the same result — at least the output is the same as on previous attempts.

jfindley commented 8 years ago

Really sorry this has taken so long - I've been swamped with other things. I've created a branch: https://github.com/jfindley/newrelic_exporter/tree/issue-4 which I think should fix the issue - can you please check it out and let me know?

If it doesn't fix the problem, can you possibly give me the debug logs from the new version?

Thanks,

James

mkristiansen commented 8 years ago

It took over 45 minutes to scrape the metrics once, but it did complete. Thanks for your support!

Would you be amenable to helping establish a filter/parameter to select which application(s) are scraped?

jfindley commented 8 years ago

45 minutes! Ouch! I should probably parallelise the retrieval; although this would come at the cost of increased memory usage, 45 minutes is clearly unreasonable.
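Roughly what I have in mind is a fixed pool of workers, so the number of in-flight requests (and hence memory) stays bounded. A standalone sketch only — fetchMetrics here is just a stand-in for the real per-application API call, not the exporter's actual code:

package main

import (
	"fmt"
	"sync"
)

// fetchMetrics is a placeholder for the real per-application metrics request.
func fetchMetrics(id int) {
	fmt.Println("fetching metrics for application", id)
}

// scrapeAll fans the application IDs out to a fixed number of workers,
// so concurrency (and memory use) is bounded by the workers parameter.
func scrapeAll(appIDs []int, workers int) {
	jobs := make(chan int)
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				fetchMetrics(id)
			}
		}()
	}

	for _, id := range appIDs {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
}

func main() {
	scrapeAll([]int{1234, 4567}, 4)
}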

A filter is also a reasonable idea, though it'd require some thought to implement, because it'd be good to be able to filter on both application IDs and metric types. It probably means introducing a config file (which could store the API key too). Something like:

api_key: zxy

1234:
  - all

4567:
  - metric1
  - metric2

Might be reasonable. I'll have a think.
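As a rough sketch of how it might be loaded — field names here are just placeholders, the per-application filters are wrapped under an explicit applications: key to keep the unmarshalling simple, and it assumes gopkg.in/yaml.v2:

package main

import (
	"fmt"
	"io/ioutil"
	"log"

	"gopkg.in/yaml.v2"
)

// config is a hypothetical layout: the API key plus per-application metric
// filters, where the value "all" would mean scrape every metric for that app.
type config struct {
	APIKey       string           `yaml:"api_key"`
	Applications map[int][]string `yaml:"applications"`
}

func loadConfig(path string) (*config, error) {
	data, err := ioutil.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var cfg config
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	cfg, err := loadConfig("newrelic_exporter.yml")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", cfg)
}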

jfindley commented 8 years ago

I've created https://github.com/jfindley/newrelic_exporter/tree/parallel-reqs which should improve the speed of the retrieval.

My testing shows that it's a bit over 3 times faster than it was, but that's on my (relatively) small data set. Sadly I don't have access to a data set anything like as huge as yours, but I'd hope the gains are bigger in your case. I'd be interested to see what difference it made, if you have a few minutes to test?

I'll have a closer look at the filtering soon.

mkristiansen commented 8 years ago

Great progress!! Ran the new branch against the metrics from one of our test systems:

curl http://localhost:9126/metrics > NRMetrics.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 73.7M  100 73.7M    0     0   688k      0  0:01:49  0:01:49 --:--:-- 18.0M

curl http://localhost:9126/metrics > NRMetrics.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 73.5M  100 73.5M    0     0   527k      0  0:02:22  0:02:22 --:--:-- 18.1M

curl http://localhost:9126/metrics > NRMetrics.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 73.5M  100 73.5M    0     0   529k      0  0:02:22  0:02:22 --:--:-- 16.6M

curl http://localhost:9126/metrics > NRMetrics.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 73.6M  100 73.6M    0     0   738k      0  0:01:42  0:01:42 --:--:-- 20.7M

jfindley commented 8 years ago

As we haven't run into any stability issues, I've merged the parallel-requests branch. Given that it now completes in under 5 minutes, do you still want the ability to filter out applications, or can I close this?

mkristiansen commented 8 years ago

OK to close. If we require additional support once we've done more exploration, I'll open a new issue. Thanks for your support.

If you need anything tested on a large application/metric set in the future please do not hesitate to reach out.