These links are the outcome of 4+ years of tuning and running our ES clusters (on premises and in the cloud).
This list is no longer under active development. It has been partially merged into https://github.com/dzharii/awesome-elasticsearch
A Dive into the Elasticsearch Storage
In this article we'll investigate the files written to the data directory by various parts of Elasticsearch. We will look at node, index and shard level files and give a short explanation of their contents in order to establish an understanding of the data written to disk by Elasticsearch.
Tuning Garbage Collection for Mission-Critical Java Applications
JVM Garbage Collector settings investigation
A comparison of JVM garbage collectors. Fantastic job!
Garbage Collection Settings for Elasticsearch Master Nodes
Fine-tuning your garbage collector
Understanding G1 GC Log Format
To tune and troubleshoot G1 GC enabled JVMs, one must have a proper understanding of G1 GC log format. This article walks through key things that one should know about the G1 GC log format.
How to start using G1
#ES_JAVA_OPTS=""
ES_JAVA_OPTS="-XX:-UseParNewGC -XX:-UseConcMarkSweepGC -XX:+UseG1GC"
In this post, we’ll explore Elasticsearch’s behavior under various types of network failure.
Call me maybe: Elasticsearch 1.5.0
Data-loss scenarios
How to achieve transactions in Elasticsearch?
Elasticsearch Refresh Interval vs Indexing Performance
Because refreshing is expensive, one way to improve indexing throughput is by increasing refresh_interval. Less refreshing means less load, and more resources can go to the indexing threads. How does all this translate into performance? Below is what our benchmarks revealed when we looked at it.
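As a rough illustration of the setting discussed above, here is a minimal sketch that builds (but does not send) a request to the index settings endpoint to raise refresh_interval. The node URL, the index name, and the "30s" value are assumptions chosen for the example, not recommendations from the article.

```python
import json
import urllib.request


def build_refresh_interval_request(base_url, index, interval):
    """Build (but do not send) a PUT request that updates index.refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical local node and index name, used only for illustration.
    req = build_refresh_interval_request("http://localhost:9200", "my-index", "30s")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To apply the change against a running cluster, uncomment:
    # urllib.request.urlopen(req)
```

Keeping the request construction separate from sending it makes the payload easy to inspect before touching a live cluster.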
A-Z Guide on Scaling Elasticsearch
In this article we will discuss the system settings in detail. This will guide you on the parameters and values to be considered in various levels including the operating system (we are considering the Unix-based systems here). Focus will also be given to the memory settings in Elasticsearch, and we will look even deeper into the heap memory management and fine tuning of the same.
Top 10 Elasticsearch Metrics to Watch
This should be especially helpful to those readers new to Elasticsearch, and also to experienced users who want a quick start into performance monitoring of Elasticsearch.
How to monitor Elasticsearch performance
Very good article from Datadog
Good checklist (with the explanations)
Nice checklist
Choosing a fast unique identifier (UUID) for Lucene
If you have your own natural ID for each document, try to pick an ID that is friendly to Lucene.
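One possible reading of "friendly to Lucene", and it is an assumption here rather than a quote from the article, is an ID whose consecutive values share a long, predictable prefix. The sketch below contrasts a zero-padded sequential ID with a random UUID. The helper names and the width of 12 digits are hypothetical.

```python
import uuid


def sequential_id(counter, width=12):
    """Zero-padded decimal ID; lexicographic order matches numeric order."""
    if counter < 0 or counter >= 10 ** width:
        raise ValueError("counter does not fit in the requested width")
    return str(counter).zfill(width)


def random_id():
    """Random UUID (version 4); consecutive IDs share no predictable prefix."""
    return uuid.uuid4().hex


if __name__ == "__main__":
    print([sequential_id(n) for n in (1, 2, 10)])
    print(random_id())
```

Whether this pattern helps in practice depends on the Elasticsearch and Lucene versions in use, so it is worth benchmarking on your own data.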
In order of my personal preference