Open stibi opened 7 months ago
I think this tells it to allocate 6GB:

```yaml
envVars:
  - name: JAVA_HEAP
    value: 6000m
```

I assume it can do with much less than 6000m. Try a tenth of that and see how it goes.
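For example, a tenth of the current value would look like this (600m is just an illustrative starting point, not a measured requirement):

```yaml
envVars:
  - name: JAVA_HEAP
    value: 600m
```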
Ah… I thought that was its maximum value, not that it makes the heap that big… that makes sense now. I was confused by another problem, where the exporter was in a crash loop all the time; I solved that by tuning the liveness probe a bit. Fiddling with the heap size was one of my attempts to fix that.
Thanks, I think it will be quite OK with the default heap size value; I'll try that in a moment.
Ouch, so maybe I wasn't so wrong about it... I removed the JAVA_HEAP env var, but the exporter started failing with java.lang.OutOfMemoryError: Java heap space. Here we go, full circle :D

So I had to put JAVA_HEAP back to see how much heap space it actually needs, and the number is 5G. With that much heap space, the exporter runs without errors. But it takes quite some time to collect all the metrics; isn't that weird?
```
INFO - 2023-11-22 09:53:39.225; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO - 2023-11-22 09:54:39.226; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO - 2023-11-22 09:55:15.506; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO - 2023-11-22 09:56:15.506; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO - 2023-11-22 09:56:53.088; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO - 2023-11-22 09:57:53.088; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO - 2023-11-22 09:58:29.369; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO - 2023-11-22 09:59:29.369; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO - 2023-11-22 10:00:06.842; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO - 2023-11-22 10:01:06.842; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO - 2023-11-22 10:01:41.788; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO - 2023-11-22 10:02:41.788; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO - 2023-11-22 10:03:22.174; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO - 2023-11-22 10:04:22.174; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO - 2023-11-22 10:04:57.249; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO - 2023-11-22 10:05:57.250; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
```
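Reading the log, each "Beginning" line is exactly one minute after the previous "Completed" (the scheduler's interval), and each collection itself takes roughly 35-40 seconds. A throwaway sanity check on one Beginning/Completed pair copied from the log above:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# One Beginning/Completed pair taken verbatim from the log above.
begin = datetime.strptime("2023-11-22 09:54:39.226", FMT)
end = datetime.strptime("2023-11-22 09:55:15.506", FMT)

duration = (end - begin).total_seconds()
print(f"collection took {duration:.2f}s")  # roughly 36 seconds for this cycle
```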
I was able to take a heap dump using the jattach utility (awesome that it's packaged with the container image, thanks for that!), but I guess I don't really know how to read it properly... it says the heap size is only 23549096 bytes... which is 23.549096 MB? That's not much.
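For reference, the conversion of the reported figure (plain arithmetic, nothing exporter-specific):

```python
heap_bytes = 23549096             # heap size reported in the dump
mb = heap_bytes / 1_000_000       # decimal megabytes
mib = heap_bytes / (1024 * 1024)  # binary mebibytes
print(f"{mb} MB ({mib:.2f} MiB)")
```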
Yep, that's 23MB. Weird that it takes a while to collect metrics: is that a symptom (e.g. the exporter is stuck in GC, so it doesn't have spare CPU to collect the metrics) or a cause (e.g. you have a ton of shards in the cluster, so collecting them takes a while and eats heap)?
Maybe G1 falls behind with garbage collection? You can verify this hypothesis by setting the GC_TUNE env var to -XX:+UseG1GC -XX:GCTimeRatio=2. Unless you have a ton of shards, I'd expect something like JAVA_HEAP=1g to be enough. Or maybe we're both missing something...
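As a sketch, the two env vars could be set together in the same envVars block shown earlier (the 1g value is a suggestion to experiment with, not a verified requirement):

```yaml
envVars:
  - name: JAVA_HEAP
    value: 1g
  - name: GC_TUNE
    value: "-XX:+UseG1GC -XX:GCTimeRatio=2"
```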
The cluster is not big at all, I think: 1 shard, 2 replicas, ~8753202 documents, taking ~22GB of memory...
Thanks for the hints, I'll take a look at the Java metrics and how GC performs.
You're welcome.
If you need something to monitor GC/JVM metrics (and Solr metrics, for that matter), we have a tool that you might find useful.
Hello, we have trouble with the solr exporter: it's very hungry for memory, it needs around ~6G of RAM, which is a lot, and I can't figure out why.
Could you give me any hints?
It's pretty much a default setup, nothing custom: SolrCloud 9.3.0, nothing too custom for the exporter deployment: