breedx-nr closed this 3 years ago
Reviewed and determined to be permanently deferred.
Hi @kford-newrelic. Does this mean Solr Cache monitoring will no longer be supported going forward?
@rahmnathan I had to create a custom NR extension for SolrCloud caches and updates.
Create a file called newrelic-solr-extension.yml in /opt/newrelic/extensions/
This is what I had in my file (not sure if it can be pared down at all). The metrics are all queryable via the Metrics table in NRQL:
name: SolrCloudCustom
version: 1.0
enabled: true
jmx:
  - object_name: solr:dom1=core,dom2=*,dom3=*,dom4=*,category=CACHE,scope=searcher,name=*
    metrics:
      - attributes: inserts, hits, size, ramBytesUsed, lookups, hitratio, evictions, warmupTime
        type: simple
  - object_name: solr:dom1=core,dom2=*,dom3=*,dom4=*,category=UPDATE,scope=updateHandler,name=*
    metrics:
      - attributes: Value
        type: simple
  - object_name: solr:dom1=node,category=UPDATE,scope=updateShardHandler,name=*
    metrics:
      - attributes: Count, Max, Mean, Min, StdDev, MeanRate, 50thPercentile, 95thPercentile, 98thPercentile, 99thPercentile, 999thPercentile, OneMinuteRate, FiveMinuteRate, FifteenMinuteRate
        type: simple
  - object_name: solr:dom1=core,dom2=*,dom3=*,dom4=*,category=INDEX,name=sizeInBytes
    metrics:
      - attributes: Value
        type: simple
  - object_name: solr:dom1=core,dom2=*,dom3=*,dom4=*,category=QUERY,scope=/select,name=*
    metrics:
      - attributes: Count, Max, Mean, Min, StdDev, MeanRate, 50thPercentile, 95thPercentile, 98thPercentile, 99thPercentile, 999thPercentile, OneMinuteRate, FiveMinuteRate, FifteenMinuteRate
        type: simple
  - object_name: solr:dom1=core,dom2=*,dom3=*,dom4=*,category=QUERY,scope=/get,name=*
    metrics:
      - attributes: Count, Max, Mean, Min, StdDev, MeanRate, 50thPercentile, 95thPercentile, 98thPercentile, 99thPercentile, 999thPercentile, OneMinuteRate, FiveMinuteRate, FifteenMinuteRate
        type: simple
  - object_name: solr:dom1=core,dom2=*,dom3=*,dom4=*,category=SEARCHER,scope=*,name=*
    metrics:
      - attributes: Count, Value
        type: simple
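Before relying on an extension file like the one above, it can help to check which beans and attributes the JVM actually exposes. The following is a minimal sketch, not the agent's implementation: it uses only the standard `javax.management` API to read a list of attributes from every MBean matching an object-name pattern. The class name `JmxAttributeDump` and its `read` method are hypothetical, and the code assumes it runs inside the Solr JVM (or is adapted to use a remote JMX connection).

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class JmxAttributeDump {

    /**
     * Reads the given attributes from every registered MBean that matches the
     * pattern. Keys in the result are "canonicalBeanName/attribute".
     * Attributes that a bean does not expose are skipped.
     */
    public static Map<String, Object> read(MBeanServer server, String pattern, String... attributes) {
        Map<String, Object> values = new LinkedHashMap<>();
        try {
            for (ObjectName bean : server.queryNames(new ObjectName(pattern), null)) {
                for (String attribute : attributes) {
                    try {
                        Object value = server.getAttribute(bean, attribute);
                        values.put(bean.getCanonicalName() + "/" + attribute, value);
                    } catch (JMException e) {
                        // Attribute missing or unreadable on this bean: skip it.
                    }
                }
            }
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Bad object name pattern: " + pattern, e);
        }
        return values;
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Pattern and attribute names copied from the first entry of the extension file.
        String pattern = "solr:dom1=core,dom2=*,dom3=*,dom4=*,category=CACHE,scope=searcher,name=*";
        read(server, pattern, "hits", "lookups", "hitratio", "evictions")
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Outside a Solr JVM the `main` method prints nothing, because no `solr:` beans are registered there.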
Thanks @mmulligan03. Did this result in the 'Solr Caches' page being populated in NewRelic? Otherwise, could you share a query you're looking at to inspect this data?
I got this config file in place, but I haven't spent time with NewRelic's query language.
It doesn't fix the Solr Caches or Update pages in APM, but you can query all the collected metrics in the Metrics table, like so:
SELECT average(newrelic.timeslice.value) FROM Metric WHERE appName = 'YOUR_SOLR_APP_NAME' AND newrelic.timeslice.value IS NOT NULL WITH METRIC_FORMAT 'JMX/solr/null/{collection}/{shard}/{replica}/CACHE/searcher/{cacheName}/core/hitratio' facet collection, cacheName SINCE 1 hour ago timeseries MAX
I reference this a lot when working with Metrics https://docs.newrelic.com/docs/data-apis/understand-data/metric-data/query-apm-metric-timeslice-data-nrql
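To make the `WITH METRIC_FORMAT` clause in the query above more concrete, here is a rough sketch of the idea: each `{placeholder}` in the template captures one segment of the metric timeslice name, and the captured values become attributes you can facet on. This is only an illustration, not New Relic's implementation. The class `MetricFormat` is hypothetical, the assumption that a placeholder matches exactly one slash-free segment is mine, and the sample metric name in the usage note is made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetricFormat {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    /**
     * Extracts placeholder values from a metric name using a template such as
     * "JMX/solr/null/{collection}/{shard}/...". Returns an empty map when the
     * metric name does not fit the template.
     */
    public static Map<String, String> parse(String format, String metricName) {
        StringBuilder regex = new StringBuilder();
        List<String> keys = new ArrayList<>();
        Matcher placeholder = PLACEHOLDER.matcher(format);
        int last = 0;
        while (placeholder.find()) {
            // Literal text between placeholders must match exactly.
            regex.append(Pattern.quote(format.substring(last, placeholder.start())));
            // Assumption: a placeholder captures one segment without slashes.
            regex.append("([^/]+)");
            keys.add(placeholder.group(1));
            last = placeholder.end();
        }
        regex.append(Pattern.quote(format.substring(last)));

        Map<String, String> fields = new LinkedHashMap<>();
        Matcher name = Pattern.compile(regex.toString()).matcher(metricName);
        if (name.matches()) {
            for (int i = 0; i < keys.size(); i++) {
                fields.put(keys.get(i), name.group(i + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String format = "JMX/solr/null/{collection}/{shard}/{replica}/CACHE/searcher/{cacheName}/core/hitratio";
        // Hypothetical metric name, for illustration only.
        String metric = "JMX/solr/null/products/shard1/replica_n1/CACHE/searcher/filterCache/core/hitratio";
        System.out.println(parse(format, metric));
    }
}
```

With the hypothetical name above, the `collection` and `cacheName` fields are what `facet collection, cacheName` would group by.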
Something like this will list all the Solr metrics you are currently collecting:
SELECT uniques(metricTimesliceName) FROM Metric WHERE appName like 'YOUR_SOLR_APP_NAME' AND newrelic.timeslice.value IS NOT NULL and metricTimesliceName like 'JMX/solr/%'
Not sure why they haven't been able to fix their agent to use the alternate format when running in SolrCloud, but this worked for us.
@mmulligan03 Really appreciate this! Using your configuration + query I've been able to get this stuff visualizing again, though obviously not as convenient as the built-in functionality that worked previously.
We're considering Graphite and/or Prometheus as Solr is supposed to support those tools as well, but you've been immensely helpful (and prompt!) getting me past this issue.
I struggled for a bit to figure out how to get it working so I'm glad I could spare you that!
@rahmnathan At the moment, we have a lot we want to accomplish for our agent roadmap and when we drew the line, Solr 8 didn't make the cut. That doesn't necessarily mean it's a "forever" thing, just for the current roadmap. Of course, if there's an enterprising engineer that wants to start with our existing instrumentation and create a PR with an update, that would be awesome!
@mmulligan03 really like your approach to crafting a custom instrumentation solution - we hope that others interested in Solr 8 can benefit from your hard work!
TL;DR
When Solr 8 is using cloud sharding, some of the JMX metrics are not reported. It would be great if they were!
About
If we look in the Solr7JmxValues.java class we can see that the `updateHandler` is looking for beanName:
but when Solr is configured for sharding, the names are dynamically generated to an arbitrary depth, and may look more like this:
For caching, the NR Solr 7 jmx support is looking for
but Solr 8 might look more like
When the New Relic JMX component cannot find these beans, the data does not get reported and ends up showing up as zeros in the UI.
Feature Description
The agent should be enhanced to find the sharded JMX beans. Rather than hard-coding a handful of fixed names, the agent should adapt to Solr 8 by listing or otherwise enumerating the bean names and matching the arbitrary-depth domain names, as shown above.
These beans should be queried/monitored by the agent and reported to New Relic for display in the NR1 UI.
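One possible way to enumerate such beans, sketched here with the standard JMX API only, is an `ObjectName` property-list pattern: a trailing `,*` matches any additional key properties, so the number of `domN` keys no longer matters. This is an assumption about how the matching could be done, not a description of the agent's code. The class `SolrBeanFinder` is hypothetical, and the bean names mentioned in the note below are illustrative, modeled on the key layout used in the extension file earlier in this thread.

```java
import java.lang.management.ManagementFactory;
import java.util.SortedSet;
import java.util.TreeSet;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class SolrBeanFinder {

    // The trailing ",*" accepts any extra keys (dom2, dom3, ..., name),
    // so beans at any domain depth match.
    public static final String CACHE_PATTERN = "solr:dom1=core,category=CACHE,scope=searcher,*";

    /** Returns true if the bean name matches the object-name pattern. */
    public static boolean matches(String pattern, String beanName) {
        try {
            return new ObjectName(pattern).apply(new ObjectName(beanName));
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Lists the canonical names of all registered MBeans matching the pattern. */
    public static SortedSet<String> findBeans(MBeanServer server, String pattern) {
        SortedSet<String> names = new TreeSet<>();
        try {
            for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
                names.add(name.getCanonicalName());
            }
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Must run inside the Solr JVM to see any solr: beans.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        findBeans(server, CACHE_PATTERN).forEach(System.out::println);
    }
}
```

With this pattern, a hypothetical bean such as `solr:dom1=core,dom2=products,dom3=shard1,dom4=replica_n1,category=CACHE,scope=searcher,name=filterCache` matches, and so would one with an extra `dom5` key, while node-level beans (`dom1=node`) do not.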
Describe Alternatives
Solr uses dropwizard/codahale metrics internally, and so the New Relic dropwizard reporter might be able to be used to get the same telemetry. Some experimentation/exploration would be required to verify that the same information can be obtained...and also how it might map to a cohesive user experience.
Additional context
It is unknown which is the earliest version of Solr that supports cloud/shard bean names with arbitrary domain depth. It is likely that this will continue in future versions of dropwizard.
Priority
"Really Want". More than one customer has asked for this support.