Open Arpit-Bandejiya opened 1 year ago
@Arpit-Bandejiya do we know when this will be released in OS? Is it v2.4? Thx
@Arpit-Bandejiya Does it produce any logs or errors during the failure? Also, when you say latest version, do you mean main branch of OpenSearch or the 2.3 release?
@Arpit-Bandejiya can you provide insight on what the fix is here? And if our sample data is failing, could others experience this in other ingest software? If so, is this not technically a breaking change, and should it be a 3.x change?
Replica count enforcement is done only when cluster.routing.allocation.awareness.balance is enabled. This feature is disabled by default. Hence it is not a breaking change.
Error response:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "invalid_index_template_exception",
        "reason" : "index_template [template_1] invalid, cause [Validation Failed: 1: expected total copies needs to be a multiple of total awareness attributes [3];]"
      }
    ],
    "type" : "invalid_index_template_exception",
    "reason" : "index_template [template_1] invalid, cause [Validation Failed: 1: expected total copies needs to be a multiple of total awareness attributes [3];]"
  },
  "status" : 400
}
This feature is present in main as well as in the 2.3 release.
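The rule in the error message above can be sketched as a small check. This is only an illustration of the wording "total copies needs to be a multiple of total awareness attributes", not OpenSearch's actual validation code. The function name is hypothetical, and "total copies" is assumed to mean one primary plus its replicas.

```python
def replica_count_is_balanced(number_of_replicas: int, awareness_values: int) -> bool:
    """Return True when (primary + replicas) is a multiple of awareness_values.

    Assumption: "total copies" means one primary plus number_of_replicas.
    """
    if number_of_replicas < 0:
        raise ValueError("number_of_replicas must not be negative")
    if awareness_values < 1:
        raise ValueError("awareness_values must be at least 1")
    total_copies = number_of_replicas + 1  # one primary plus its replicas
    return total_copies % awareness_values == 0


# With three awareness attribute values, as in the error above ([3]):
print(replica_count_is_balanced(1, 3))  # 2 total copies
print(replica_count_is_balanced(2, 3))  # 3 total copies
```

Under this reading, a template with one replica would be rejected on a cluster with three awareness values, while two replicas would pass.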
Triage - does the sample data need to support any combination of user settings/options (what's the purpose and use-case for sample data)?
UX: should sample data support non-default cluster configurations?
@Arpit-Bandejiya can you describe in more detail what the fix should be? Do the sample data index settings need a dynamic way to specify certain settings fields, such as auto_expand_replicas? From the error messages, it seems it will need to be dynamic based on the cluster's total awareness attributes.
Update - I've learned the key cluster setting to be aware of is the max awareness attribute value, of which there could be multiple (AZs, rack IDs, etc.). The upper limit of auto_expand_replicas must be a multiple of that. Note that this setting by default does not take into account awareness attributes. From documentation:
Note that the auto-expanded number of replicas only takes allocation filtering rules into account, but ignores any other allocation rules such as shard allocation awareness and total shards per node
Because of this, if cluster.routing.allocation.awareness.balance is set to true, and a user ingests sample data, there is no current way (I believe) to easily read the total awareness attribute value and update the index setting before index creation, and so the ingestion may fail if the replica count isn't a multiple of the max AZ count.
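A rough sketch of the check a client would have to do itself is shown here. All function names are hypothetical, the awareness values are passed in by hand because the thread notes there is no easy way to read them, and the sketch assumes the "total copies" wording from the error message (upper bound plus one primary) also applies to the auto_expand_replicas cap, which this thread does not confirm.

```python
from typing import Dict, List, Optional


def max_awareness_values(attributes: Dict[str, List[str]]) -> int:
    """Largest number of distinct values across all awareness attributes."""
    return max((len(set(values)) for values in attributes.values()), default=1)


def parse_auto_expand_upper(setting: str) -> Optional[int]:
    """Return the numeric upper bound of a 'min-max' value.

    Returns None for 'false' (disabled) and for 'min-all' (no numeric cap).
    """
    if setting == "false":
        return None
    lower, sep, upper = setting.partition("-")
    if not sep or not lower.isdigit():
        raise ValueError(f"unexpected auto_expand_replicas value: {setting!r}")
    if upper == "all":
        return None
    if not upper.isdigit():
        raise ValueError(f"unexpected auto_expand_replicas value: {setting!r}")
    return int(upper)


def upper_bound_is_balanced(setting: str, attributes: Dict[str, List[str]]) -> bool:
    """Check (upper bound + 1 primary) against the max awareness value count."""
    upper = parse_auto_expand_upper(setting)
    if upper is None:
        return True  # no numeric cap to check in this sketch
    return (upper + 1) % max_awareness_values(attributes) == 0


# Hypothetical cluster: three zones and two racks, so the max is 3.
attrs = {"zone": ["a", "b", "c"], "rack": ["r1", "r2"]}
print(upper_bound_is_balanced("0-1", attrs))  # cap of 1 replica, 2 copies
print(upper_bound_is_balanced("0-2", attrs))  # cap of 2 replicas, 3 copies
```

Even this small helper needs cluster knowledge the sample data loader does not have, which is why a fixed value in the sample index settings is fragile.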
Maybe just adding documentation around this setting is sufficient. @gbbafna can you point me to the current documentation for this setting? I can't seem to find it in the OpenSearch docs.
I will defer the decision to the feature owner and Dashboards team for deciding on the path forward. From a plugin owner perspective, it is more logical and maintainable to maintain the same sample data index configuration as that of core Dashboards, and so I will work on a fix in the AD plugin to consume such settings.
Hi @ohltyler: Please find the documentation at https://opensearch.org/docs/latest/tuning-your-cluster/cluster/ . Search for Replica count enforcement in there.
We have also added default_replica_count as a cluster-level setting: https://github.com/opensearch-project/OpenSearch/pull/5610/ . For sample data, it should be fine to use that instead of using auto expand replica at all. Using that, AD won't need to bother about all of the cluster settings used as well.
Yes, totally agree. Thanks for providing this option!
We can eliminate this setting and consume cluster defaults. I will work on making that change on the AD plugin side.
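As an illustration of that approach, the sample index settings can simply omit explicit replica keys so that whatever the cluster defines as its default applies. This is a minimal sketch, not the AD plugin's actual change. The helper name and the example settings are hypothetical, and dropping number_of_replicas along with auto_expand_replicas is an assumption.

```python
from typing import Any, Dict

# Keys removed so the cluster default decides the replica count.
# Assumption: settings use these short key names, not an "index." prefix.
REPLICA_KEYS = ("auto_expand_replicas", "number_of_replicas")


def use_cluster_replica_defaults(index_settings: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the settings without explicit replica keys."""
    return {
        key: value
        for key, value in index_settings.items()
        if key not in REPLICA_KEYS
    }


# Hypothetical sample data index settings:
sample = {"number_of_shards": 1, "auto_expand_replicas": "0-1"}
print(use_cluster_replica_defaults(sample))  # {'number_of_shards': 1}
```

The input dictionary is left unmodified, so callers that still need the original settings can keep using them.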
Update: AD-related changes have been merged & backported - see https://github.com/opensearch-project/anomaly-detection-dashboards-plugin/pull/423
Describe the bug
Due to the changes made here in OpenSearch: https://github.com/opensearch-project/OpenSearch/pull/3462/files#diff-013717f93370bf1d9635d1b84aee81e7e003e3fd6c6bb7c74b9890a1327a04b6
we are seeing that sample data creation fails due to a low replica count.
To Reproduce Steps to reproduce the behavior:
cluster.routing.allocation.awareness.balance in the opensearch.yml file to enable the feature).
Expected behaviour We should be able to create the sample data from the dashboard.
OpenSearch Version latest version
Dashboards Version Any dashboard version supported